Logical Methods in Computer Science 
Vol. 4 (4:11) 2008, pp. 1-44 
www.lmcs-online.org 



Submitted Jan. 17, 2008 
Published Nov. 25, 2008 



FIRST-ORDER AND TEMPORAL LOGICS FOR NESTED WORDS 

RAJEEV ALUR, MARCELO ARENAS, PABLO BARCELÓ, KOUSHA ETESSAMI,
NEIL IMMERMAN, AND LEONID LIBKIN

Department of Computer and Information Science, University of Pennsylvania
e-mail address: alur@cis.upenn.edu

Department of Computer Science, Pontificia Universidad Católica de Chile
e-mail address: marenas@ing.puc.cl 

Department of Computer Science, Universidad de Chile 
e-mail address: pbarcelo@dcc.uchile.cl 

School of Informatics, University of Edinburgh, Edinburgh 
e-mail address: kousha@inf.ed.ac.uk 

Department of Computer Science, University of Massachusetts 
e-mail address: immerman@cs.umass.edu 

School of Informatics, University of Edinburgh, Edinburgh
e-mail address: libkin@inf.ed.ac.uk 



Abstract. Nested words are a structured model of execution paths in procedural pro- 
grams, reflecting their call and return nesting structure. Finite nested words also capture 
the structure of parse trees and other tree-structured data, such as XML. 

We provide new temporal logics for finite and infinite nested words, which are natural 
extensions of LTL, and prove that these logics are first-order expressively-complete. One 
of them is based on adding a "within" modality, evaluating a formula on a subword, to 
a logic CaRet previously studied in the context of verifying properties of recursive state 
machines (RSMs). The other logic, NWTL, is based on the notion of a summary path 
that uses both the linear and nesting structures. For NWTL we show that satisfiability 
is EXPTIME-complete, and that model-checking can be done in time polynomial in the 
size of the RSM model and exponential in the size of the NWTL formula (and is also 
EXPTIME-complete).

Finally, we prove that first-order logic over nested words has the three-variable property,
and we present a temporal logic for nested words which is complete for the two-variable 
fragment of first-order. 
1. Introduction

An execution of a procedural program can reveal not just a linear sequence of program 
states encountered during the execution, but also the correspondence between each point 
during the execution at which a procedure is called and the point when we return from 
that procedure call. This leads naturally to the notion of a finite or infinite nested word 
(see [H O [2]). A nested word is simply a finite or $\omega$-word supplied with an additional
binary matching relation which relates corresponding call and return points (and of course 
satisfies "well-bracketing" properties). Finite nested words offer an alternative way to view 
any data which has both a sequential string structure as well as a tree-like hierarchical 
structure. Examples of such data are XML documents and parse trees. 

Pushdown systems (PDSs), Boolean Programs, and Recursive State Machines (RSMs), 
are equivalent abstract models of procedural programs, with finite data abstraction but 
unbounded call stack. Software model checking technology is by now thoroughly developed 
for checking $\omega$-regular properties of runs for these models, when the runs are viewed as
ordinary words (see OEKI]). Unfortunately, temporal logic and $\omega$-regular properties over
ordinary words are inadequate for expressing a variety of properties of program executions 
that are useful in interprocedural program analysis and software verification. These include 
Hoare-like pre/post conditions on procedures, stack inspection properties, and other useful 
program analysis properties that go well beyond $\omega$-regular (see [2] for some examples). On
the other hand, many such program analysis properties can easily be expressed when runs 
are viewed as nested words. Runs of Boolean Programs and RSMs can naturally be viewed 
as nested words once we add "summary edges" between matching calls and returns, and we 
can thus hope to extend model checking technology for procedural programs using richer 
temporal logics over nested words which remain tractable for analysis. 

These considerations motivated the definition of Visibly Pushdown Languages (VPLs) 
[3] and the call-return temporal logic CaRet [2]. CaRet is a temporal logic over nested 
words which extends LTL with new temporal operators that allow for navigation through
a nested word both via its ordinary sequential structure, as well as its matching call-return 
summary structure. The standard LTL model checking algorithms for RSMs and PDSs 
can be extended to allow model checking of CaRet, with essentially the same complexity 
[2]. VPLs [3] are a richer class of languages that capture MSO-definable properties of 
nested words. Recently, results about VPLs have been recast in light of nested words, and 
in particular in terms of Nested Word Automata [1] which offer a machine acceptor for 
($\omega$-)regular nested words, with all the expected closure properties.

Over ordinary words, LTL has long been considered the temporal logic of choice for 
program verification, not only because its temporal operators offer the right abstraction for 
reasoning about events over time, but because it provides a good balance between expres- 
siveness (first-order complete), conciseness (can be exponentially more succinct compared 
to automata), and the complexity of model-checking (linear time in the size of the finite 
transition system, and PSPACE in the size of the temporal formula). 

This raises the question: What is the right temporal logic for nested words? 

The question obviously need not have a unique answer, particularly since nested words 
can arise in various application domains: for example, program verification, as we already 
discussed, or navigation and querying XML documents under "sequential" representation 
(see, e.g., [28]). However, it is reasonable to hope that any good temporal logic for nested 
words should possess the same basic qualities that make LTL a good logic for ordinary
words, namely: 

(1) first-order expressive completeness: LTL has the same expressive power as first-order 
logic over words, and we would want the same over nested words (of course, even 
more expressiveness, such as full MSO, would be nice but natural temporal logics are 
subsumed by first order logic and any further expressiveness typically comes at a cost, 
even over words, of some other desirable properties); 

(2) reasonable complexity for model checking and satisfiability; and 

(3) nice closure properties: LTL is closed under boolean combinations including negation 
without any blow-up, and we would want the same for a logic over nested words. 

Finally (and perhaps least easy to quantify), we want 

(4) natural temporal operators with simple and intuitive semantics. 

Unfortunately, the logic CaRet appears to be deficient with respect to some of these 
criteria: although it is easily first-order expressible, it is believed to be incomplete but prov- 
ing incompleteness appears to be difficult. CaRet can express program path properties (for 
example, every lock operation is eventually followed by an unlock operation) and local path 
properties (for example, if a procedure executes a lock operation then the same procedure 
later executes an unlock operation before returning), but it seems incapable of expressing 
scope-bounded path properties (for example, every lock operation in a procedure is even- 
tually followed by an unlock operation before the procedure returns). Such scope-bounded 
path properties are natural program requirements, and are expressible in the first-order 
logic of nested words. There is much related work in the XML community on logics for 
trees (see, e.g., surveys [El [HI [29] ) , but they tend to have different kinds of deficiencies for 
our purposes: they concentrate on the hierarchical structure of the data and largely ignore 
its linear structure; also, they are designed for finite trees. 

We introduce and study new temporal logics over nested words. The main logic we
consider, Nested Word Temporal Logic (NWTL), extends LTL with both a future and past
variant of the standard Until operator, which is interpreted over summary paths rather 
than the ordinary linear sequence of positions. A summary path is the unique shortest 
directed path one can take between a position in a run and some future position, if one is 
allowed to use both successor edges and matching call-return summary edges. We show that 
NWTL possesses all the desirable properties we want from a temporal logic on nested words. 
In particular, it is both first-order expressively complete and has good model checking 
complexity. Indeed we provide a tableaux construction which translates an NWTL formula 
into a Nested Word Automaton, enabling the standard automata theoretic approach to 
model checking of Boolean Programs and RSMs with complexity that is polynomial in the 
size of the model and EXPTIME in the size of the formula (and indeed EXPTIME-complete).

We then explore some alternative temporal logics, which extend variants of CaRet with 
variants of unary "Within" operators proposed in [2], and we show that these extensions are 
also FO-complete. However, we observe that the model checking and satisfiability problems 
for these logics are 2EXPTIME-complete. These logics are - provably - more concise than 
NWTL, but we pay for conciseness with added complexity. 

It follows from our proof of FO-completeness for NWTL that over nested words, every 
first-order formula with one free variable can be expressed using only 3 variables. More 
generally, we show, using EF games, that 3 variables suffice for expressing any first order
formula with two or fewer free variables, similarly to the case of words [13] or finite trees
[20]. Finally, we show that a natural unary temporal logic over nested words is expressively
complete for first-order logic with 2 variables, echoing a similar result known for unary 
temporal logic over ordinary words [9]. 

Related Work. VPLs and nested words were introduced in [Sjll]. The logic CaRet was
defined in [2] with the goal of expressing and checking some natural non-regular program 
specifications. The theory of VPLs and CaRet has been recast in light of nested words in 
[1]. Other aspects of nested words (automata characterizations, games, model-checking) 
were further studied in [H |3l O |T7] . It was also observed that nested words are closely 
related to a sequential, or "event-based" API for XML known as SAX [25] (as opposed to a 
tree-based DOM API [7]). SAX representation is very important in streaming applications, 
and questions related to recognizing classes of nested words by the usual word automata 
have been addressed in [281 E].

While finite nested words can indeed be seen as XML documents under the SAX rep- 
resentation, and while much effort has been spent over the past decade on languages for 
tree-structured data (see, e.g., [HI [Ml [29] for surveys), adapting the logics developed for 
tree-structured data is not as straightforward as it might seem, even though from the com- 
plexity point of view, translations between the DOM and the SAX representations are easy 
[27]. The main problem is that most such logics rely on the tree-based representation and
ignore the linear structure, making the natural navigation through nested words rather
unnatural under the tree representation. Translations between DOM and SAX are easy for
first-order properties, but verifying navigational properties expressed in first-order is
necessarily non-elementary even for words if one wants to keep the data complexity linear [10].
On the other hand, logics for XML tend to have good model-checking properties (at least
in the finite case), typically matching the complexity of LTL [11, 22]. We do employ such
logics (e.g., those in [19, 20, 26]) in the proof of the expressive completeness of NWTL, first
by using syntactic translations that reconcile both types of navigation, and then by com- 
bining them with a composition game argument that extends the result to the infinite case, 
which is not considered in the XML setting. This, however, involves a nontrivial amount of 
work. Furthermore, "within" operators do not have any natural analog on trees, and the 
proof for them is done by a direct composition argument on nested words. 

Organization. Basic notations are given in Section 2. Section 3 defines temporal logics
on nested words, and Section 4 presents expressive completeness results. We study
model-checking in Section 5, and in Section 6 we prove the 3-variable property and present a logic
for the 2-variable fragment.

2. Notations 

2.1. Nested Words. A matching on $\mathbb{N}$ or an interval $[1, n]$ of $\mathbb{N}$ consists of a binary relation
$\mu$ and two unary relations call and ret, satisfying the following: (1) if $\mu(i,j)$ holds then
call$(i)$ and ret$(j)$ and $i < j$; (2) if $\mu(i,j)$ and $\mu(i,j')$ hold then $j = j'$, and if $\mu(i,j)$ and
$\mu(i',j)$ hold then $i = i'$; (3) if $i \le j$ and call$(i)$ and ret$(j)$ then there exists $i \le k \le j$ such
that either $\mu(i,k)$ or $\mu(k,j)$.

Let $\Sigma$ be a finite alphabet. A finite nested word of length $n$ over $\Sigma$ is a tuple
$\bar{w} = (w, \mu, \mathrm{call}, \mathrm{ret})$, where $w = a_1 \ldots a_n \in \Sigma^*$, and $(\mu, \mathrm{call}, \mathrm{ret})$ is a
matching on $[1,n]$. A nested $\omega$-word is a tuple $\bar{w} = (w, \mu, \mathrm{call}, \mathrm{ret})$, where
$w = a_1 a_2 \ldots \in \Sigma^\omega$, and $(\mu, \mathrm{call}, \mathrm{ret})$ is a matching on $\mathbb{N}$.

We say that a position $i$ in a nested word $\bar{w}$ is a call position if call$(i)$ holds; a return
position if ret$(i)$ holds; and an internal position if it is neither a call nor a return. If
$\mu(i,j)$ holds, we say that $i$ is the matching call of $j$, and $j$ is the matching return of $i$, and
write $c(j) = i$ and $r(i) = j$. Calls without matching returns are pending calls, and returns
without matching calls are pending returns (sometimes we will alternatively refer to such
calls and returns as unmatched). A nested word is said to be well-matched if no calls or
returns are pending. Note that for well-matched nested words, the unary predicates call
and ret are uniquely specified by the relation $\mu$.

A nested word $\bar{w} = (w, \mu, \mathrm{call}, \mathrm{ret})$ is represented as a first-order structure

$\langle U, (P_a)_{a \in \Sigma}, <, \mu, \mathrm{call}, \mathrm{ret} \rangle$,

where $U$ is $\{1, \ldots, n\}$ if $w$ is a finite word of length $n$ and $\mathbb{N}$ if $\bar{w}$ is a nested $\omega$-word; $<$ is
the usual ordering, $P_a$ is the set of positions labeled $a$, and $(\mu, \mathrm{call}, \mathrm{ret})$ is the matching
relation. When we talk about first-order logic (FO) over nested words, we assume FO over
such structures (i.e. the vocabulary is that of words plus the matching relation).

For a nested word $\bar{w}$, and two elements $i, j$ of $\bar{w}$, we denote by $\bar{w}[i,j]$ the substructure
of $\bar{w}$ (i.e. a finite nested word) induced by elements $\ell$ such that $i \le \ell \le j$. If $j < i$ we
assume that $\bar{w}[i,j]$ is the empty nested word. For nested $\omega$-words $\bar{w}$, we let $\bar{w}[i,\infty]$ denote
the substructure induced by elements $\ell \ge i$.

When this is clear from the context, we do not distinguish references to positions in
subwords $\bar{w}[i,j]$ and $\bar{w}$ itself, e.g., we shall often write $(\bar{w}[i,j], i) \models \varphi$ to mean that $\varphi$ is true
at the first position of $\bar{w}[i,j]$.

Figure 1 shows two finite nested words (without the labeling with alphabet symbols).
Nesting edges are drawn using dashed lines. For the first word, the relation $\mu$ is
{(2, 8), (4, 7)}, the set call is {2, 4}, and the set ret is {7, 8}. For the second word, the
relation $\mu$ is {(2, 3)}, the set call is {2, 5, 7}, and the set ret is {1, 3, 4}.

Note that our definition allows a nesting edge from a position $i$ to its linear successor,
and in that case there will be two edges from $i$ to $i+1$; this is the case for positions 2 and 3 of
the second sequence. The second sequence has two pending calls and two pending returns.
Pending calls are depicted by dashed outgoing edges and pending returns are depicted by
dashed incoming edges. Note that all pending return positions in a nested word appear
before any of the pending call positions (this is enforced by condition (3) of the definition
of matchings).
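
The three conditions on a matching can be checked mechanically on finite examples. The following is a minimal Python sketch, not part of the original development; the function names `is_matching` and `pending` are hypothetical, the word lengths 9 and 7 are assumptions read off the positions mentioned for Figure 1, and condition (3) is read with non-strict inequalities (the strict reading would reject a call immediately followed by its matching return, which the text allows).

```python
def is_matching(n, mu, call, ret):
    """Check conditions (1)-(3) of a matching on the interval [1, n].

    mu is a set of pairs (i, j); call and ret are sets of positions.
    """
    if not all(1 <= p <= n for p in call | ret):
        return False
    # (1) every matched pair goes from a call to a strictly later return
    if any(not (i in call and j in ret and i < j) for i, j in mu):
        return False
    # (2) each call has at most one return and each return at most one call
    if len({i for i, _ in mu}) != len(mu) or len({j for _, j in mu}) != len(mu):
        return False
    # (3) whenever a call i precedes a return j, some k in [i, j] is matched
    #     with i (as its return) or with j (as its call)
    for i in call:
        for j in ret:
            if i <= j and not any((i, k) in mu or (k, j) in mu
                                  for k in range(i, j + 1)):
                return False
    return True


def pending(mu, call, ret):
    """Return the sets of pending (unmatched) calls and returns."""
    matched_calls = {i for i, _ in mu}
    matched_rets = {j for _, j in mu}
    return call - matched_calls, ret - matched_rets


# The two nested words of Figure 1 (lengths are assumptions).
first = (9, {(2, 8), (4, 7)}, {2, 4}, {7, 8})
second = (7, {(2, 3)}, {2, 5, 7}, {1, 3, 4})
```

Under this reading, a crossing relation such as {(1, 3), (2, 4)} is rejected by condition (3), which is what enforces well-bracketing.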

2.2. Games and types. The quantifier rank (or quantifier depth) of an FO formula $\varphi$ is
the depth of quantifier nesting in $\varphi$. The rank-$k$ type of a structure $\mathfrak{M}$ over a relational
vocabulary is the set $\{\varphi \mid \mathfrak{M} \models \varphi$ and the quantifier rank of $\varphi$ is $k\}$, where $\varphi$ ranges over
FO sentences over the vocabulary. It is well-known that there are finitely many rank-$k$
types for all $k$, and for each rank-$k$ type $\tau$ there is an FO sentence $\varphi_\tau$ such that $\mathfrak{M} \models \varphi_\tau$
iff the rank-$k$ type of $\mathfrak{M}$ is $\tau$. Sometimes we associate types with formulas that define them.

Many proofs in this paper make use of Ehrenfeucht-Fraïssé (EF) games, see for example
[12]. This game is played on two structures, $\mathfrak{M}$ and $\mathfrak{M}'$, over the same vocabulary, by two
players, Player I and Player II. In round $i$ Player I selects a structure, say $\mathfrak{M}$, and an
element $c_i$ in the domain of $\mathfrak{M}$; Player II responds by selecting an element $e_i$ in the domain
of $\mathfrak{M}'$. Player II wins in $k$ rounds, for $k \ge 0$, if $\{(c_i, e_i) \mid i \le k\}$ defines a partial isomorphism
between $\mathfrak{M}$ and $\mathfrak{M}'$. Also, if $\bar{a}$ is an $m$-tuple in the domain of $\mathfrak{M}$ and $\bar{b}$ is an $m$-tuple in
the domain of $\mathfrak{M}'$, where $m \ge 0$, we write $(\mathfrak{M}, \bar{a}) \equiv_k (\mathfrak{M}', \bar{b})$ whenever Player II wins in $k$
rounds no matter how Player I plays, but starting from position $(\bar{a}, \bar{b})$.

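We write $\mathfrak{M} \equiv_k \mathfrak{M}'$ iff $\mathfrak{M}$ and $\mathfrak{M}'$ have the same rank-$k$ type, that is, for every FO
sentence $\varphi$ of quantifier rank $k$, $\mathfrak{M} \models \varphi \Leftrightarrow \mathfrak{M}' \models \varphi$. It is well-known that $\mathfrak{M} \equiv_k \mathfrak{M}'$ iff
Player II has a winning strategy in the $k$-round Ehrenfeucht-Fraïssé game on $\mathfrak{M}$ and $\mathfrak{M}'$.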

In the proof of Theorem 6.1 we shall also use $k$-pebble games. In such a game, Player
I and Player II have access to $k$ matching pebbles each, and each round consists of Player I
either removing, or placing, or replacing a pebble in one structure, and Player II replicating
the move in the other structure. The correspondence given by the matching pebbles should
be a partial isomorphism. If Player II can play while maintaining partial isomorphism for $m$
rounds, then the structures agree on all FO$^k$ sentences of quantifier rank up to $m$; if Player
II can play while maintaining partial isomorphism forever, then the structures agree on all
FO$^k$ sentences. (FO$^k$ is first-order logic where at most $k$ distinct variables may occur.)
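
To make the $k$-round EF game concrete, here is a brute-force Python sketch that decides the winner on two finite linear orders. This is an illustration only, under simplifying assumptions: the structures are plain linear orders {0, ..., n-1} with the single relation <, not nested words, and the function name `duplicator_wins` is hypothetical. The search is exponential and meant for very small inputs.

```python
from functools import lru_cache


def duplicator_wins(n, m, k):
    """True iff Player II wins the k-round EF game on the linear orders
    {0, ..., n-1} and {0, ..., m-1} (vocabulary: < only)."""

    def partial_iso(a, b):
        # Checking < on all ordered pairs also forces equalities to agree.
        idx = range(len(a))
        return all((a[s] < a[t]) == (b[s] < b[t]) for s in idx for t in idx)

    @lru_cache(maxsize=None)
    def wins(a, b, rounds):
        if not partial_iso(a, b):
            return False
        if rounds == 0:
            return True
        # Player I picks in the first structure, Player II answers in the second
        first = all(any(wins(a + (c,), b + (e,), rounds - 1) for e in range(m))
                    for c in range(n))
        # Player I picks in the second structure, Player II answers in the first
        second = all(any(wins(a + (c,), b + (e,), rounds - 1) for c in range(n))
                     for e in range(m))
        return first and second

    return wins((), (), k)
```

For instance, `duplicator_wins(3, 4, 2)` holds while `duplicator_wins(3, 4, 3)` does not: orders of sizes 3 and 4 agree on all sentences of quantifier rank 2 but are separated at rank 3.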

3. Temporal Logics over Nested Words 

We now describe our approach to temporal logics for nested words. It is similar to the
approach taken by the logic CaRet [2]. Namely, we shall consider LTL-like logics that define
the next/previous and until/since operators for various types of paths in nested words.

All the logics will be able to refer to propositional letters, including the base unary
relations call and ret, and will be closed under all Boolean combinations. We shall write
$\top$ for true and $\bot$ for false. For all the logics, we shall define the notion of satisfaction with
respect to a position in a nested word: we write $(\bar{w}, i) \models \varphi$ to denote that the formula $\varphi$ is
true in position $i$ of the word $\bar{w}$.

Since nested words are naturally represented as transition systems with two binary
relations - the successor and the matching relation - in all our logics we introduce next
operators $\bigcirc$ and $\bigcirc_\mu$. The semantics of those is standard: $(\bar{w}, i) \models \bigcirc\varphi$ iff $(\bar{w}, i+1) \models \varphi$, and
$(\bar{w}, i) \models \bigcirc_\mu\varphi$ iff $i$ is a call with a matching return $j$ (i.e., $\mu(i,j)$ holds) and $(\bar{w}, j) \models \varphi$.
Likewise, we shall have past operators $\ominus$ and $\ominus_\mu$: that is, $\ominus\varphi$ is true in position $i > 1$
iff $\varphi$ is true in position $i - 1$, and $\ominus_\mu\varphi$ is true in position $j$ if $j$ is a return position with
matching call $i$ and $\varphi$ is true at $i$.

3.1. Paths in Nested Words. The until/since operators depend on what a path is. In
general, there are various notions of paths through a nested word. We shall consider
until/since operators for paths that are unambiguous: that is, for every pair of positions $i$ and
$j$ with $i < j$, there could be at most one path between them. Then, with respect to any
such given notion of a path, we have the until and since operators with the usual semantics:
• $(\bar{w}, i) \models \varphi\,\mathbf{U}\,\psi$ iff there is a position $j \ge i$ and a path $i = i_0 < i_1 < \ldots < i_k = j$ between
them such that $(\bar{w}, j) \models \psi$ and $(\bar{w}, i_p) \models \varphi$ for every $0 \le p < k$.
• $(\bar{w}, i) \models \varphi\,\mathbf{S}\,\psi$ iff there is a position $j \le i$ and a path $j = i_0 < i_1 < \ldots < i_k = i$ between
them such that $(\bar{w}, j) \models \psi$ and $(\bar{w}, i_p) \models \varphi$ for every $0 < p \le k$.

The approach of CaRet was to introduce three types of paths, based on the linear
successor (called linear paths), the call-return relation (called abstract paths), and the innermost
call relation (called call paths).

To define those, we need the notions $\mathcal{C}(i)$ and $\mathcal{R}(i)$ for each position $i$ - these are the
innermost call within which the current action $i$ is executed, and its corresponding return.
Formally, $\mathcal{C}(i)$ is the greatest matched call position $j < i$ whose matching return is after $i$
(if such a call position exists), and $\mathcal{R}(i)$ is the least matched return position $\ell > i$ whose
matching call is before $i$.
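
The innermost call and its return can be computed directly from the matching relation. The Python sketch below is illustrative only: the names `innermost_call`, `innermost_return` and `call_stack` are hypothetical, the matching is assumed to be given as a dictionary from matched calls to their returns, and the example data is the first word of Figure 1.

```python
def innermost_call(i, mu):
    """C(i): greatest matched call j < i whose matching return is after i,
    or None if no such call exists."""
    return max((c for c, r in mu.items() if c < i < r), default=None)


def innermost_return(i, mu):
    """R(i): least matched return l > i whose matching call is before i,
    or None if no such return exists."""
    return min((r for c, r in mu.items() if c < i < r), default=None)


def call_stack(i, mu):
    """Positions reached by repeatedly jumping to the innermost call,
    innermost first; this traverses the call path ending at i backwards."""
    stack = []
    j = innermost_call(i, mu)
    while j is not None:
        stack.append(j)
        j = innermost_call(j, mu)
    return stack


# First nested word of Figure 1: matched pairs (2, 8) and (4, 7).
mu_first = {2: 8, 4: 7}
```

On this word, position 5 is executed inside the call at 4, which is itself inside the call at 2, so `call_stack(5, mu_first)` lists 4 and then 2.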

Definition 3.1 (Linear, call and abstract paths). Given two positions $i < j$, a sequence
$i = i_0 < i_1 < \ldots < i_k = j$ is

• a linear path if $i_{p+1} = i_p + 1$ for all $p < k$;

• a call path if $i_p = \mathcal{C}(i_{p+1})$ for all $p < k$;

• an abstract path if

$$i_{p+1} = \begin{cases} r(i_p) & \text{if } i_p \text{ is a matched call} \\ i_p + 1 & \text{if } i_p \text{ is not a call and } i_p + 1 \text{ is not a matched return.} \end{cases}$$

We shall denote until/since operators corresponding to these paths by $\mathbf{U}/\mathbf{S}$ for linear paths,
$\mathbf{U}^c/\mathbf{S}^c$ for call paths, and $\mathbf{U}^a/\mathbf{S}^a$ for abstract paths.

Our logics will have some of the next/previous and until/since operators. Some examples are:

• When we restrict ourselves to the purely linear fragment, our operators are $\bigcirc$ and $\ominus$,
and $\mathbf{U}$ and $\mathbf{S}$, i.e., precisely LTL (with past operators).

• The logic CaRet [2] has the following operators: the next operators $\bigcirc$ and $\bigcirc_\mu$; the linear
and abstract untils (i.e., $\mathbf{U}$ and $\mathbf{U}^a$), the call since (i.e., $\mathbf{S}^c$) and a previous operator $\ominus_c$,
defined by: $(\bar{w}, i) \models \ominus_c\varphi$ iff $\mathcal{C}(i)$ is defined and $(\bar{w}, \mathcal{C}(i)) \models \varphi$.

Another notion of a path combines both the linear and the nesting structure. It is the
shortest directed path between two positions $i$ and $j$. Unlike an abstract path, it decides
when to skip a call based on position $j$. Basically, a summary path from $i$ to $j$ moves along
successor edges until it finds a call position $k$. If $k$ has a matching return $\ell$ such that $j$
appears after $\ell$ then the summary path skips the entire call from $k$ to $\ell$ and continues from
$\ell$, otherwise the path continues as a successor path. Note that every abstract path is a
summary path, but there are summary paths that are not abstract paths.

Definition 3.2. A summary path between $i < j$ in a nested word $\bar{w}$ is a sequence
$i = i_0 < i_1 < \ldots < i_k = j$ such that for all $p < k$,

$$i_{p+1} = \begin{cases} r(i_p) & \text{if } i_p \text{ is a matched call and } j \ge r(i_p) \\ i_p + 1 & \text{otherwise.} \end{cases}$$

The corresponding until/since operators are denoted by $\mathbf{U}^\sigma$ and $\mathbf{S}^\sigma$.

We will also consider two special kinds of summary paths: summary-down paths are
allowed to use only call edges (from a call position $i$ to $i + 1$ where $i + 1$ is not a return),
nesting edges (from a call to its matching return), and internal edges (from some $i$ to $i + 1$
where $i$ is not a call and $i + 1$ is not a return), and summary-up paths are allowed to use
only return edges (from a position preceding a return to the return), nesting edges and
internal edges. (In other words, summary-down paths are summary paths with no return
edges and summary-up paths are summary paths with no call edges.)

We will use $\mathbf{U}^{\sigma\downarrow}$ and $\mathbf{U}^{\sigma\uparrow}$ to denote the corresponding until operators. A general
summary path is a concatenation of a summary-up path and summary-down path: $\varphi\,\mathbf{U}^\sigma\psi$
is equivalent to $\varphi\,\mathbf{U}^{\sigma\uparrow}(\varphi\,\mathbf{U}^{\sigma\downarrow}\psi)$.

We will also study the expressiveness of various until modalities when the logic is
extended with the within operator, which allows restriction to a subword. If $\varphi$ is a formula,
then $\mathcal{W}\varphi$ is a formula, and $(\bar{w}, i) \models \mathcal{W}\varphi$ iff $i$ is a call, and $(\bar{w}[i,j], i) \models \varphi$, where $j = r(i)$ if
$i$ is a matched call, $j = |\bar{w}|$ if $i$ is an unmatched call and $\bar{w}$ is finite, and $j = \infty$ otherwise.
In other words, $\mathcal{W}\varphi$ evaluates $\varphi$ on a subword restricted to a single procedure.
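
As a small illustration of how the within operator restricts evaluation, the Python sketch below computes the interval of the subword for finite nested words and checks one fixed property on it (that no position is labelled wr). It is a sketch under assumptions: the function names and the labelling are hypothetical, only finite words are handled, and positions are 1-indexed as in the text, with `labels[p - 1]` holding the set of propositions at position p.

```python
def within_interval(i, n, mu, calls):
    """Interval [i, j] on which the within operator evaluates its argument
    at position i of a finite nested word of length n; None if i is not a call."""
    if i not in calls:
        return None
    # matched call: up to its return r(i); pending call: up to the end of the word
    return (i, mu.get(i, n))


def within_no_write(i, labels, mu, calls):
    """Holds iff i is a call and no position of the subword is labelled 'wr'."""
    bounds = within_interval(i, len(labels), mu, calls)
    if bounds is None:
        return False
    lo, hi = bounds
    return all("wr" not in labels[p - 1] for p in range(lo, hi + 1))


# First word of Figure 1 with a hypothetical labelling: a write at position 3.
mu_first, calls_first = {2: 8, 4: 7}, {2, 4}
labels = [set(), set(), {"wr"}, set(), set(), set(), set(), set(), set()]
```

Here the outer call at position 2 encloses the write at position 3, while the inner call at position 4 does not, so the property fails at 2 and holds at 4.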

To understand the various notions of paths in a nested word, let us consider the left
word shown in Figure 1 again. An abstract path uses internal and nesting edges; for
example, (1, 2, 8, 9) and (3, 4, 7) are abstract paths. Summary-down paths, in addition, can
use call edges; for example, (1, 2, 3, 4, 7) is a summary-down (but not an abstract) path.
Summary-up paths can use internal and nesting edges, and can also go along return edges;
for example, (3, 4, 7, 8, 9) is a summary-up path. A summary path is a summary-up path
followed by a summary-down path; for example, (3, 4, 5, 6, 7) in the right word of Figure 1
is a summary path (which also happens to be a linear path). Every two positions have a
unique summary path connecting them, and this is the "shortest" path in the underlying
graph between these positions.
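
Because the summary path between two positions is unique, it can be computed by a simple loop that follows Definition 3.2. The Python sketch below is illustrative; the function name `summary_path` and the dictionary encoding of the matching relation are assumptions, and the data is that of the two words of Figure 1.

```python
def summary_path(i, j, mu):
    """Summary path from position i to position j (i <= j), following
    Definition 3.2. mu maps each matched call to its matching return."""
    path = [i]
    while path[-1] < j:
        p = path[-1]
        # Take the nesting edge only when it does not overshoot the target j;
        # otherwise move to the linear successor.
        path.append(mu[p] if p in mu and mu[p] <= j else p + 1)
    return path


mu_left = {2: 8, 4: 7}   # left word of Figure 1
mu_right = {2: 3}        # right word of Figure 1
```

With this encoding, `summary_path(1, 7, mu_left)` enters the call at position 2 (its return, 8, lies beyond the target) and then skips the inner call from 4 to 7, reproducing the path (1, 2, 3, 4, 7) discussed above.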

3.2. Specifying Requirements. We now discuss how the various operators can be used 
for specifying requirements for sequential structured programs. In the classical linear-time 
semantics of programs, an execution of a program is modeled as a word over program states. 
In the nested-word semantics, this linear structure is augmented with nesting edges from 
entries to exits of program blocks. The main benefit is that using nesting edges one can skip 
procedure calls entirely, and continue to trace a local path through the calling procedure. A 
program is now viewed as a generator of nested words, and requirements are written using 
temporal logics over nested words. 

Suppose we want to express the requirement that, along a global program execution, 
every write to a variable is followed by a read before the variable is written again. If wr 
and rd denote the atomic propositions that capture write and read operations, respectively, 
then the requirement is expressed by the until formula over linear paths, 

$\Box\,[\,\mathit{wr} \rightarrow (\neg\mathit{wr})\,\mathbf{U}\,\mathit{rd}\,]$
Here, $\Box$ is defined in the usual manner from the linear until: $\Box\varphi$ stands for $\neg(\top\,\mathbf{U}\,\neg\varphi)$.
This property is clearly already expressible in LTL and does not use nesting edges at all. 
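
The semantics of the linear until and the derived box operator can be prototyped on finite words in a few lines of Python. This is a sketch under assumptions, not the paper's construction: formulas are modelled as predicates on a word and a position, a word is a list of sets of atomic propositions, positions are 0-indexed (unlike the text), and all function names are hypothetical.

```python
def atom(name):
    """Atomic proposition: holds where the position is labelled with name."""
    return lambda word, i: name in word[i]


def true(word, i):
    return True


def neg(phi):
    return lambda word, i: not phi(word, i)


def until(phi, psi):
    """Linear until: psi holds at some j >= i, and phi holds at every
    position from i up to, but not including, j."""
    def holds(word, i):
        for j in range(i, len(word)):
            if psi(word, j):
                return True
            if not phi(word, j):
                return False
        return False  # end of the finite word reached without seeing psi
    return holds


def always(phi):
    """Box phi, defined as not (true U not phi)."""
    return neg(until(true, neg(phi)))


word = [{"wr"}, set(), {"rd"}, set()]
eventually_rd = until(true, atom("rd"))
never_wr = always(neg(atom("wr")))
no_wr_until_rd = until(neg(atom("wr")), atom("rd"))
```

Note that under this reading the left argument of the until must already hold at the starting position unless the right argument holds there.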

Now let us review some of the properties expressible in the nested call-return logic 
CaRet of [2], but not expressible in LTL. In the classical verification formalisms such as 
Hoare logic, correctness of procedures is expressed using pre and post conditions. Partial 
correctness of a procedure A specifies that if the pre-condition p holds when the procedure 
A is invoked, then if the procedure terminates, the post-condition q is satisfied upon return. 
Total correctness, in addition, requires the procedure to terminate. Assume that all calls 
to the procedure A are characterized by the proposition $p_A$. Then, the requirement

$\Box\,[\,(\mathrm{call} \wedge p \wedge p_A) \rightarrow \bigcirc_\mu q\,]$
expresses the total correctness, while

$\Box\,[\,(\bigcirc_\mu\top \wedge p \wedge p_A) \rightarrow \bigcirc_\mu q\,]$

expresses the partial correctness. Both these specifications crucially rely upon the abstract- 
next operator. 

The abstract path starting at a position inside a procedure A is obtained by successive 
applications of internal and nesting edges, and skips over invocations of other procedures 
called from A. Using the abstract versions of temporal operators, we can specify properties 
of such abstract paths. For example, suppose we want to specify that if a procedure writes 
to a variable, then it (that is, the same invocation of the same procedure) will later read 
it and do so before writing to it again. The requirement is expressed by the until formula 
over abstract paths 

$\Box\,[\,\mathit{wr} \rightarrow (\neg\mathit{wr})\,\mathbf{U}^a\,\mathit{rd}\,]$

The call since-path starting at a position inside a procedure A is obtained by successively 
jumping to the innermost call positions, and encodes the active stack at that position. Stack 
inspection can specify a variety of security properties. For instance, the requirement that a 
procedure A should be invoked only within the context of a procedure B, with no intervening 
call to an overriding module C, is expressed by the formula 

$\Box\,[\,\mathrm{call} \wedge p_A \rightarrow (\neg p_C)\,\mathbf{S}^c\,p_B\,]$.

Finally, we turn to scope-bounded linear-path properties. For a procedure, the cor- 
responding scope-bounded linear path is the linear path (that is, the path obtained by 
following linear edges) from its call to its return. That is, a scope-bounded path corresponding
to a procedure P includes the executions of the procedures (transitively) called by P,
but terminates when the current invocation of P returns. Properties about scope-bounded 
paths are useful in asserting contracts for modules. 

Suppose we want to assert that a procedure A, and the procedures it calls, do not 
write to a variable before it returns. This is an invariant of the scope-bounded path, and is 
captured by the formula: 

$\Box\,[\,(\mathrm{call} \wedge p_A) \rightarrow \mathcal{W}(\Box\,\neg\mathit{wr})\,]$

Recall that the within operator W restricts the evaluation of a formula to a single procedure 
call. The same requirement can also be captured using summary paths. It is even easier to 
state it using summary-down paths: 

$\Box\,[\,(\mathrm{call} \wedge p_A) \rightarrow \neg(\top\,\mathbf{U}^{\sigma\downarrow}\,\mathit{wr})\,]$

Suppose we want to specify the requirement that if a procedure writes to a variable 
then it is read along the scope-bounded path before being written again. We can use the 
within modality to express this property: 

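$\Box\,[\,\mathrm{call} \rightarrow \mathcal{W}\,\Box\,(\mathit{wr} \rightarrow \bigcirc((\neg\mathit{wr})\,\mathbf{U}\,\mathit{rd}))\,]$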

This requirement can also be alternatively specified using summary-down paths as follows: 

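$\Box\,[\,\mathit{wr} \rightarrow \bigcirc(\,(\neg\mathit{wr} \wedge (\mathrm{ret} \rightarrow \ominus_\mu\neg(\top\,\mathbf{U}^{\sigma\downarrow}\,\mathit{wr})))\;\mathbf{U}^{\sigma\downarrow}\,\mathit{rd}\,)\,]$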

The formula says that from every write operation, there is a read operation along some 
summary-down path (and thus, within the same scope) such that along the path, there is 
no write, and if the path uses a summary edge, then the enclosed subword also does not 
contain a write. 
It is easy to see that the above requirements concerning scope-bounded paths are
specifiable in the first-order logic of nested words. It is conjectured that they are not specifiable
in CaRet. 

4. Expressive Completeness 

In this section, we study logics that are expressively complete for FO, i.e. temporal logics
that have exactly the same power as FO formulas with one free variable over finite and
infinite nested words. In other words, for every formula $\varphi$ of an expressively complete
temporal logic there is an FO formula $\varphi'(x)$ such that $(\bar{w}, i) \models \varphi$ iff $\bar{w} \models \varphi'(i)$ for every
nested word $\bar{w}$ and position $i$ in it, and conversely, for every FO formula $\psi(x)$ there is a
temporal formula $\psi'$ such that $\bar{w} \models \psi(i)$ iff $(\bar{w}, i) \models \psi'$.

Our starting point is a logic NWTL (nested-word temporal logic) based on summary 
paths introduced in the previous section. We show that this logic is expressively complete 
for FO, and of course remains expressively complete with the addition of other first-order 
expressible operators which may be useful for verification of properties of procedural pro- 
grams. When we provide upper bounds on the complexity of model checking for NWTL, 
we shall in fact show that the upper bounds hold with respect to an extension, NWTL$^+$,
which includes a number of additional operators. 

We then look at logics close to those in the verification literature, i.e., with operators 
such as call and abstract until and since, and ask what needs to be added to them to get 
expressive completeness. We confirm a conjecture of [2] that a within operator is sufficient. 
Such an operator evaluates a formula on a nested subword. We then discuss the role of this 
within operator. We show that, if added to NWTL, it does not increase expressiveness, but 
makes the logic exponentially more succinct. 

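4.1. Expressive completeness and NWTL. The logic NWTL (nested words temporal
logic) has next and previous operators, as well as until and since with respect to summary
paths. That is, its formulas are given by:

$$\varphi, \varphi' \;:=\; \top \mid a \mid \mathbf{call} \mid \mathbf{ret} \mid \neg\varphi \mid \varphi \vee \varphi' \mid \bigcirc\varphi \mid \bigcirc_\mu\varphi \mid \ominus\varphi \mid \ominus_\mu\varphi \mid \varphi\,\mathbf{U}^\sigma\varphi' \mid \varphi\,\mathbf{S}^\sigma\varphi'$$

where $a$ ranges over $\Sigma$. We use the abbreviation int for $\neg\mathbf{call} \wedge \neg\mathbf{ret}$ (true in an internal
position). Note that in the absence of pending calls and returns, call and ret are definable
as $\bigcirc_\mu\top$ and $\ominus_\mu\top$, respectively.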

Theorem 4.1. NWTL = FO over both finite and infinite nested words.
Proof. We start with the easy direction NWTL $\subseteq$ FO.

Lemma 4.2. For every NWTL formula $\varphi$, there exists an FO formula $\alpha_\varphi(x)$ that uses at
most three variables $x, y, z$ such that for every nested word $w$ (finite or infinite), and every
position $i$ in $w$, we have $(w, i) \models \varphi$ iff $w \models \alpha_\varphi(i)$.

Proof of Lemma 4.2. The proof is by induction on the formulas and very simple for all the
cases except $\mathbf{U}^\sigma$ and $\mathbf{S}^\sigma$: for example,

$$\alpha_{\bigcirc_\mu\varphi}(x) \;=\; \exists y\,\big(\mu(x, y) \wedge \exists x\,(x = y \wedge \alpha_\varphi(x))\big).$$

For translating $\mathbf{U}^\sigma$, we need a few auxiliary formulas. Our first goal is to define a
formula $\gamma_r(x, z)$ saying that $x$ is $\mathcal{R}(z)$, i.e. the return of the innermost call within which $z$
is executed. For that, we start with $\delta(y, z) = z < y \wedge \mathbf{ret}(y) \wedge \exists x\,(\mu(x, y) \wedge x < z)$, saying
that $y$ is a return that is preceded by $z$ and whose matching call precedes $z$, that is, $y$ is a
candidate for $\mathcal{R}(z)$. Then the formula $\gamma_r(x, z)$ is given by

$$\exists y\,(y = x \wedge \delta(y, z)) \;\wedge\; \forall y\,(\delta(y, z) \to y \geq x).$$
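The formulas $\delta$ and $\gamma_r$ can also be read operationally. The following Python sketch is an illustration only and not part of the proof. It assumes a finite nested word given by its length `n`, a set `rets` of return positions, and a dictionary `mu` from each matched call to its return; these names and the sample word are ours.

```python
def delta(y, z, rets, mu):
    """delta(y, z): z < y, y is a return, and the call matching y precedes z."""
    return z < y and y in rets and any(r == y and c < z for c, r in mu.items())


def innermost_return(z, n, rets, mu):
    """Position x satisfying gamma_r(x, z), i.e. the least delta-candidate.

    Returns None when no candidate exists (R(z) is undefined).
    """
    candidates = [y for y in range(1, n + 1) if delta(y, z, rets, mu)]
    return min(candidates) if candidates else None


# Hypothetical nested word of length 8: calls 1 and 3 match returns 7 and 5,
# and position 8 is a pending return (it has no entry in mu).
mu = {1: 7, 3: 5}
rets = {5, 7, 8}
print(innermost_return(4, 8, rets, mu))
print(innermost_return(6, 8, rets, mu))
```

Taking the minimum mirrors the second conjunct of $\gamma_r$: among all returns that enclose $z$, the earliest one belongs to the innermost enclosing call.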

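Likewise, we define $\gamma_c(y, z)$ stating that $y$ equals $\mathcal{C}(z)$, that is, the innermost call within
which $z$ is executed. Now define

$$\chi_1(y, z) = \exists x\,(\gamma_r(x, z) \wedge x \leq y), \qquad \chi_2(x, z) = \exists y\,(\gamma_c(y, z) \wedge y \geq x)$$

and $\chi(x, y, z)$ as $\chi_1(y, z) \wedge \chi_2(x, z)$. Then this formula says that the summary path from $x$
to $y$ does not pass through $z$, assuming $x < z < y$. With this, $\alpha_{\varphi \mathbf{U}^\sigma \psi}(x)$ is given by

$$\alpha_\psi(x) \;\vee\; \exists y\,\Big( y > x \wedge \alpha_\varphi(x) \wedge \exists x\,(x = y \wedge \alpha_\psi(x)) \wedge \forall z\,\big((x < z < y \wedge \neg\chi(x, y, z)) \to \exists x\,(x = z \wedge \alpha_\varphi(x))\big)\Big).$$

The proof for $\varphi\,\mathbf{S}^\sigma\psi$ is similar. This concludes the proof of the lemma. □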

In the proof of the other direction, FO $\subseteq$ NWTL, we shall use a tree representation of
nested words. The translation is essentially the same as in [3]. For each nested word $w$ we
have a binary tree $T_w$ (i.e., its nodes are elements of $\{0, 1\}^*$) and a function $\iota_{w\text{-}t} : w \to T_w$
that maps each position of $w$ to a node of $T_w$ as follows:

• the first position of $w$ is mapped into the root of $T_w$;

• if $s = \iota_{w\text{-}t}(i)$ then:

(1) if $i$ is an internal, or an unmatched call, or a matched call whose return is the last
position of $w$, or an unmatched return, and $i$ is not the last position of $w$, then $s$ has
only child $s \cdot 0$ and $\iota_{w\text{-}t}(i + 1) = s \cdot 0$;

(2) if $i$ is a matched call whose return is not the last position in $w$, then $s$ has both
children $s \cdot 0$ and $s \cdot 1$ and $\iota_{w\text{-}t}(r(i) + 1) = s \cdot 0$, and $\iota_{w\text{-}t}(i + 1) = s \cdot 1$;

(3) if $i$ is a matched return, then $s$ has no children.

The $\Sigma$-labels of $i$ and $\iota_{w\text{-}t}(i)$ are the same. If $i$ was a pending call, we label $\iota_{w\text{-}t}(i)$ with
pcall, and if $i$ was a pending return, we label $\iota_{w\text{-}t}(i)$ with pret.

Note that $\iota_{w\text{-}t}$ is a bijection, and that labels pcall and pret may only occur on the
leftmost branch of $T_w$. An example of a nested word and its translation are given in Fig. 2.
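The three rules above can be sketched as a short Python function. This is an illustrative sketch for finite nested words only, under our own assumptions: positions are numbered from 1, the dictionary `mu` maps each matched call to its return, tree nodes are strings over {0, 1} with the empty string as the root, and labels are ignored. The sample word is hypothetical.

```python
def tree_translation(n, mu):
    """Map positions 1..n of a finite nested word to nodes of its tree.

    mu maps each matched call to its matching return. Pending calls and
    returns have no entry in mu and are handled by rule (1).
    """
    matched_returns = set(mu.values())
    iota = {1: ""}                      # the first position is the root
    for i in range(1, n + 1):
        s = iota[i]
        if i in matched_returns:
            continue                    # rule (3): no children
        if i in mu and mu[i] != n:      # rule (2): return is not the last position
            iota[mu[i] + 1] = s + "0"
            iota[i + 1] = s + "1"
        elif i != n:                    # rule (1): a single child s.0
            iota[i + 1] = s + "0"
    return iota


# Hypothetical word of length 7 with matching edges (1, 5) and (2, 3).
iota = tree_translation(7, {1: 5, 2: 3})
for position in sorted(iota):
    print(position, repr(iota[position]))
```

Processing positions in increasing order suffices because the node of position $i + 1$ is always assigned either by position $i$ or by the call matching $i$, both of which come earlier.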

To relate paths in nested words and paths in their tree translations, we introduce the 
notions of semi-strict and strict paths. Intuitively, a semi-strict path in a nested word 
corresponds to a path on its tree translation that, in addition to following tree edges, can 
jump from a node with no children to its successor in the depth-first traversal of the tree 
(where depth-first starts with the right subtree and then moves to the left subtree). A strict
path is just a path that follows tree edges. These are both slight modifications of summary 
paths. 

More precisely, a semi-strict path between positions $i$ and $j$, with $i < j$, in a nested
word $w$, is a sequence $i = i_0 < i_1 < \cdots < i_k = j$ such that

$$i_{p+1} = \begin{cases} r(i_p) + 1 & \text{if } i_p \text{ is a matched call and } j > r(i_p) \\ i_p + 1 & \text{otherwise.} \end{cases}$$

Figure 2: A nested word and its tree translation



That is, when skipping a call, instead of jumping to the matching return position, a semi- 
strict path will jump to its successor. 

A strict path is a semi-strict path $i = i_0 < i_1 < i_2 < \cdots < i_k = j$ in which no $i_p$
with $p < k$ is a matched return position. In other words, a strict path stops if it reaches a
matched return position. In particular there may be positions $i < j$ in a nested word such
that no strict path exists between them.

For example, in Fig. 2, (2, 4, 5, 6) is a semi-strict path. Although (2, 4, 5, 6) is not a
path in the tree (we jump from 5 to 6), this is allowed under the definition of semi-strict
paths. Strict paths are exactly the paths on the tree; for example, (1, 2, 4, 5) is such a path.
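The two definitions can be turned directly into a small Python sketch. It is illustrative only: the dictionary `mu` (matched call to return) and the sample matching relation are our assumptions, chosen so that the two paths mentioned above arise; the sketch does not claim to reproduce the figure.

```python
def semi_strict_path(i, j, mu):
    """The semi-strict path from i to j (i < j), following the case definition."""
    path = [i]
    while path[-1] < j:
        p = path[-1]
        if p in mu and j > mu[p]:
            path.append(mu[p] + 1)    # skip the call and land just after its return
        else:
            path.append(p + 1)
    return path


def strict_path(i, j, mu):
    """The strict path from i to j, or None if none exists."""
    path = semi_strict_path(i, j, mu)
    matched_returns = set(mu.values())
    # No position before the last may be a matched return.
    if any(p in matched_returns for p in path[:-1]):
        return None
    return path


mu = {1: 5, 2: 3}   # hypothetical matching edges
print(semi_strict_path(2, 6, mu))
print(strict_path(2, 6, mu))
print(strict_path(1, 5, mu))
```

Each step never overshoots $j$, so a semi-strict path always exists and is unique; a strict path fails to exist exactly when that unique path has to leave a matched return.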

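The until/since operators for semi-strict paths and strict paths will be denoted by
$\mathbf{U}^{ss}/\mathbf{S}^{ss}$ and $\mathbf{U}^{s}/\mathbf{S}^{s}$, respectively. Versions of NWTL in which $\mathbf{U}^\sigma/\mathbf{S}^\sigma$ are replaced by
$\mathbf{U}^{ss}/\mathbf{S}^{ss}$ ($\mathbf{U}^{s}/\mathbf{S}^{s}$) will be denoted by NWTL$^{ss}$ (NWTL$^{s}$).

We will use mret for $\mathbf{ret} \wedge \ominus_\mu\top$, and mcall for $\mathbf{call} \wedge \bigcirc_\mu\top$, to capture matching
return and call positions, respectively.

The proof is based on two lemmas.

Lemma 4.3. NWTL$^{s}$ $\subseteq$ NWTL$^{ss}$ $\subseteq$ NWTL.

Lemma 4.4. FO $\subseteq$ NWTL$^{s}$.

This of course implies the theorem: NWTL $\subseteq$ FO $\subseteq$ NWTL$^{s}$ $\subseteq$ NWTL$^{ss}$ $\subseteq$ NWTL.
Note that as a corollary we also obtain NWTL$^{s}$ = NWTL$^{ss}$ = FO.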

Proof of Lemma 4.3. For translating an NWTL$^{s}$ formula $\varphi$ into an equivalent formula $\alpha_\varphi$
of NWTL$^{ss}$ we need to express $\varphi\,\mathbf{U}^{s}\theta$ with $\mathbf{U}^{ss}$, which is simply $(\alpha_\varphi \wedge \neg\mathbf{mret})\,\mathbf{U}^{ss}\alpha_\theta$, and
likewise for the since operators. For translating each NWTL$^{ss}$ formula $\varphi$ into an equivalent
NWTL formula $\beta_\varphi$, again we need to consider only the case of until/since operators. The
formula $\varphi\,\mathbf{U}^{ss}\theta$ is translated into
Pe V (^P^ A (^(^{P^, V ret) A (-mcall ^ 0/3^) A (mcall ^ (0^0,9^ V 0;.0/3e))) U'^ 

(^{P^ V ret) A (-mcall ^ OPe) A (mcall ^ (0/3^ V 0^.0/30 V 0(-ret A 7)))^ ^ ^ , (4.1) 
where 7 is a formula defined as follows: 

(/3^ V ret) A (-mcall ^ 0/3^) A (mcall ^ (O/Sv^ V 0^0/3v.)) A (Oret call)^ U'" 
{p^ V ret) A (-mcall ^ 0/3e) A (mcall ^ {OPe V 0^0/3e)) 



The idea is that we split a semi-strict path into a semi-strict up path (where call edges
are excluded) followed by a semi-strict down path (where return edges are excluded). The
first Until in (4.1) captures the semi-strict up path and the second Until in $\gamma$ captures the
semi-strict down path. The translation for $\mathbf{S}^{ss}$ is similar.

The proof that the translation is correct is a rather detailed case analysis which we 
have relegated to the appendix. □ 



Proof of Lemma 4.4. We start with the finite case, and then show how the inclusion extends
to nested $\omega$-words.

As a tool we shall need a slight modification of a result from [26, 19] providing an
expressively complete temporal logic for trees with at most binary branching. We consider
binary trees whose domain $D$ is a prefix-closed subset of $\{0, 1\}^*$, and we impose a condition
that if $s \cdot 1 \in D$ then $s \cdot 0 \in D$. When we refer to FO on trees, we assume they have two
successor relations $S_0$, $S_1$ and the descendant relation $\preceq$ (which is just the prefix relation
on strings) plus the labeling predicates, which include two new labels pcall and pret (for
pending calls and returns). Each node can be labeled by either a letter from $\Sigma$, or by a
letter from $\Sigma$ and pcall, or by a letter from $\Sigma$ and pret (i.e. labels pcall and pret need
not be disjoint from other labels).

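We also consider the following logic TL$^{tree}$:

$$\varphi \;:=\; a \mid \varphi \vee \varphi \mid \neg\varphi \mid \bigcirc_\downarrow\varphi \mid \ominus_\downarrow\varphi \mid \bigcirc_\rightarrow\varphi \mid \ominus_\leftarrow\varphi \mid \varphi\,\mathbf{U}_\downarrow\varphi \mid \varphi\,\mathbf{S}_\downarrow\varphi$$

where $a$ ranges over $\Sigma \cup \{\mathbf{pcall}, \mathbf{pret}\}$, with the following semantics:

• $(T, s) \models \bigcirc_\downarrow\varphi$ iff $(T, s \cdot i) \models \varphi$ for some $i \in \{0, 1\}$;

• $(T, s \cdot i) \models \ominus_\downarrow\varphi$ iff $(T, s) \models \varphi$ (where $i$ is either 0 or 1);

• $(T, s \cdot 0) \models \bigcirc_\rightarrow\varphi$ iff $(T, s \cdot 1) \models \varphi$;

• $(T, s \cdot 1) \models \ominus_\leftarrow\varphi$ iff $(T, s \cdot 0) \models \varphi$;

• $(T, s) \models \varphi\,\mathbf{U}_\downarrow\psi$ iff there exists $s'$ such that $s \preceq s'$, $(T, s') \models \psi$, and $(T, s'') \models \varphi$ for all $s''$
such that $s \preceq s'' \prec s'$;

• $(T, s) \models \varphi\,\mathbf{S}_\downarrow\psi$ iff there exists $s'$ such that $s' \preceq s$, $(T, s') \models \psi$, and $(T, s'') \models \varphi$ for all $s''$
such that $s' \prec s'' \preceq s$.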
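The two path modalities can be evaluated directly from their semantics when the tree is finite. The sketch below is a hedged illustration under our own conventions: a tree is a prefix-closed set of strings over {0, 1}, subformulas are passed as Python predicates on nodes, and the labelled tree is a made-up example.

```python
def until_down(nodes, phi, psi, s):
    """phi U-down psi at s: psi holds at some s2 below-or-equal s, and
    phi holds at every node on the path from s up to (excluding) s2."""
    for s2 in nodes:
        if s2.startswith(s) and psi(s2):
            between = [s2[:k] for k in range(len(s), len(s2))]
            if all(phi(t) for t in between):
                return True
    return False


def since_down(nodes, phi, psi, s):
    """phi S-down psi at s: psi holds at some ancestor-or-self s1 of s, and
    phi holds at every node strictly below s1 on the path down to s."""
    for k in range(len(s) + 1):
        s1 = s[:k]
        if psi(s1) and all(phi(s[:m]) for m in range(k + 1, len(s) + 1)):
            return True
    return False


# Hypothetical labelled tree.
label = {"": "a", "1": "a", "10": "a", "100": "b", "11": "c", "0": "c", "00": "b"}
nodes = set(label)
is_a = lambda t: label[t] == "a"
is_b = lambda t: label[t] == "b"
print(until_down(nodes, is_a, is_b, ""))
print(until_down(nodes, is_a, is_b, "0"))
print(since_down(nodes, is_b, is_a, "100"))
```

Because every node has a unique path to the root, `since_down` only has to scan the prefixes of `s`, whereas `until_down` has to search all descendants.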

Lemma 4.5 (see [19]). For unary queries over finite binary trees, TL$^{tree}$ = FO.

This lemma is an immediate corollary of expressive completeness of logic $X_{until}$ from [19]
on ordered unranked trees, as for a fixed number of siblings, the until and since operators
can be expressed in terms of the next and previous operators. The result of [19] applies to
arbitrary alphabets, and thus in particular to our labeling that may use pcall and pret.

The following is immediate by using the tree representation of nested words and a 
straightforward translation of formulae. 

Claim 4.6. For every FO formula $\varphi(x)$ over nested words there is an FO formula $\varphi'(x)$
over trees such that for every nested word $w$ and a position $i$ in it, we have $w \models \varphi(i)$ iff
$T_w \models \varphi'(\iota_{w\text{-}t}(i))$.

In fact the converse, that FO over trees can be translated into FO over nested words,
is true too, but we do not need it in this proof.

Since FO = TL$^{tree}$ by Lemma 4.5, all that remains to prove is the following claim.

Claim 4.7. For every TL$^{tree}$ formula $\varphi$, there exists an NWTL$^{s}$ formula $\varphi^\circ$ such that for
every nested word $w$ and every position $i$ in it, we have

$$(w, i) \models \varphi^\circ \;\Leftrightarrow\; (T_w, \iota_{w\text{-}t}(i)) \models \varphi.$$

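This is now done by induction, omitting the obvious cases of propositional letters and
Boolean connectives. We note that a path down the tree from $\iota_{w\text{-}t}(i)$ to $\iota_{w\text{-}t}(j)$ corresponds
precisely to the strict path from $i$ to $j$ (that is, if such a strict path is $i = i_0, i_1, \ldots, i_k = j$,
then $\iota_{w\text{-}t}(i_0), \iota_{w\text{-}t}(i_1), \ldots, \iota_{w\text{-}t}(i_k)$ is the path from $\iota_{w\text{-}t}(i)$ to $\iota_{w\text{-}t}(j)$ in $T_w$). Hence, the
translations of until and since operators are:

$$(\varphi\,\mathbf{U}_\downarrow\psi)^\circ = \varphi^\circ\,\mathbf{U}^{s}\psi^\circ, \qquad (\varphi\,\mathbf{S}_\downarrow\psi)^\circ = \varphi^\circ\,\mathbf{S}^{s}\psi^\circ.$$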

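For translating next and previous operators, and pending calls/returns, define:
mcall $= \bigcirc_\mu\top$ (true in a matched call position);
mret $= \ominus_\mu\top$ (true in a matched return position).
Then the rest of the translation is as follows:

$$\begin{array}{rcl}
\mathbf{pcall}^\circ &=& \mathbf{call} \wedge \neg\mathbf{mcall} \\
\mathbf{pret}^\circ &=& \mathbf{ret} \wedge \neg\mathbf{mret} \\
(\bigcirc_\downarrow\varphi)^\circ &=& \neg\mathbf{mret} \wedge \big(\bigcirc\varphi^\circ \vee (\mathbf{call} \wedge \bigcirc_\mu\bigcirc\varphi^\circ)\big) \\
(\ominus_\downarrow\varphi)^\circ &=& (\ominus\mathbf{ret} \wedge \ominus\ominus_\mu\varphi^\circ) \vee (\ominus\neg\mathbf{mret} \wedge \ominus\varphi^\circ) \\
(\bigcirc_\rightarrow\varphi)^\circ &=& \ominus\mathbf{ret} \wedge \ominus\ominus_\mu\bigcirc\varphi^\circ \\
(\ominus_\leftarrow\varphi)^\circ &=& \ominus\mathbf{call} \wedge \ominus\bigcirc_\mu\bigcirc\varphi^\circ
\end{array}$$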



Now with the proof completed for finite nested words, we extend it to the case of nested
$\omega$-words. Note that Claim 4.6 continues to hold, and Claim 4.7 provides a syntactic translation
that applies to both finite and infinite nested words, and thus it suffices to prove an analog
of Lemma 4.5 for trees of the form $T_w$, where $w$ ranges over nested $\omega$-words.

If $w$ is a nested $\omega$-word, then $T_w$ has exactly one infinite branch, which consists precisely
of all nodes of the form $\iota_{w\text{-}t}(i)$ where $i$ is an outer position, i.e., not inside any (matched)
call. We say that $i$ is inside a call if there exists a call $j$ with a matching return $k$ such that
$j < i \leq k$. If $i$ is an outer position, then we shall call $\iota_{w\text{-}t}(i)$ an outer node in the tree $T_w$
as well.

If $i$ is an outer position which is not a matched call, then $i + 1$ is also an outer position
and $\iota_{w\text{-}t}(i + 1)$ is the left successor of $\iota_{w\text{-}t}(i)$. If $i$ is an outer position and a call with $j > i$
being its matching return, then the left successor of $\iota_{w\text{-}t}(i)$ on the infinite path is $\iota_{w\text{-}t}(j + 1)$.
Furthermore, the subtree $t_w(i)$, which has $\iota_{w\text{-}t}(i)$ as the root, plus its right child, and all
the descendants of the right child, is finite and isomorphic to $T_{w[i,j]}$ (note that $w[i, j]$ has
no pending calls/returns). If $i$ is an outer position other than a matched call, we let $t_w(i)$
be a single node tree labeled with $i$'s label in $w$.

Let $w$ now be a nested $\omega$-word. For each outer position $i$ we let $\tau_m^w(i)$ be the rank-$m$
type of $t_w(i)$. If $i$ is not a matched call, such a type is completely described by $i$'s label
(which consists of a label in $\Sigma$ and potentially pcall or pret).

If $j$ is not an outer position, and $i$ is an outer position such that $i < j \leq k$, where $k$ is
the matching return of $i$, then $\tau_m^w(j)$ is the rank-$m$ type of $(T_{w[i,k]}, \iota_{w\text{-}t}(j))$ (i.e., the type
of $T_{w[i,k]}$ with a distinguished node corresponding to $j$).

Next, for a nested $\omega$-word $w$, let $s$ be a node in $T_w$ such that $s = \iota_{w\text{-}t}(i)$. Let $i_1, i_2, \ldots$
enumerate all the outer positions of $w$, and assume that $i_p$ is such that $i_p \leq i < i_{p+1}$, that
is, $\iota_{w\text{-}t}(i)$ is a node in the subtree $t_w(i_p)$. We now define a finite word $s_m^{\leftarrow}(w, s)$ of length $p - 1$
such that its positions $1, \ldots, p - 1$ are labeled $\tau_m^w(i_1), \ldots, \tau_m^w(i_{p-1})$, and an $\omega$-word $s_m^{\rightarrow}(w, s)$
such that its positions $1, 2, \ldots$ are labeled by $\tau_m^w(i_{p+1}), \tau_m^w(i_{p+2}), \ldots$. Next we show:

Claim 4.8. Let $w, w'$ be two nested $\omega$-words, and $s = \iota_{w\text{-}t}(i)$, $s' = \iota_{w'\text{-}t}(i')$ two nodes in $T_w$
and $T_{w'}$ such that:

(a) $s_m^{\leftarrow}(w, s) \equiv_m s_m^{\leftarrow}(w', s')$;

(b) $s_m^{\rightarrow}(w, s) \equiv_m s_m^{\rightarrow}(w', s')$;

(c) $\tau_m^w(i) = \tau_m^{w'}(i')$.
Then $(T_w, s) \equiv_m (T_{w'}, s')$.

Proof. A standard composition argument shows that Player II wins. If $i_1, i_2, \ldots$ enumerate
outer positions in $w$ and $i_p \leq i < i_{p+1}$, then a move by Player I, say, in $T_w$, occurs either
in $t_w(j)$ with $j < i$, or in $t_w(i)$, or in $t_w(j)$ with $j > i$. Player II then selects $j'$ so that the
response is in $t_{w'}(j')$ according to his winning strategy in games either (a) or (b) (if $j$ is in
$t_w(i)$, then $j'$ is in $t_{w'}(i')$), and then, since the rank-$m$ types of $t_w(j)$ and the chosen $t_{w'}(j')$
are the same, selects the actual response according to the winning strategy $t_w(j) \equiv_m t_{w'}(j')$.

□ 

Next we show how Claim 4.8 proves that FO is expressible in TL$^{tree}$ over infinite trees
$T_w$. First note that being an outer node is expressible: since $\ominus_\leftarrow\top$ is true in right children
of matched calls, then

$$\alpha_{outer} = \neg\big(\top\,\mathbf{S}_\downarrow\,(\ominus_\leftarrow\top)\big)$$

is true if no node on the path to the root is inside a call, that is, precisely in outer nodes.

Next note that for each rank-$m$ type $\tau$ of a tree there is a TL$^{tree}$ formula $\beta_\tau$ such that
if $s = \iota_{w\text{-}t}(i)$ is an outer node of $T_w$ then $(T_w, s) \models \beta_\tau$ iff the rank-$m$ type of $t_w(i)$ is $\tau$. If
$i$ is not a matched call, then such a type is uniquely determined by $i$'s label and perhaps
pcall or pret, and thus is definable in TL$^{tree}$.

If $i$ is a matched call, the existence of such a formula $\beta_\tau$ follows from the fact that the
rank-$m$ type of $t_w(i)$ is completely determined by the label of $i$ and the rank-$m$ type $\tau'$ of
the subtree $t^0_w(i)$ of $t_w(i)$ rooted at the right child of $s$ (recall that the root only has a right
child, by the definition of $t_w(i)$). Type $\tau'$ is expressible in FO and, since $t_w(i)$ is finite, by
Lemma 4.5 it is expressible by a TL$^{tree}$ formula $\beta'_{\tau'}$. If we now inductively take conjunction
of every subformula in $\beta'_{\tau'}$ with $\neg\alpha_{outer}$, we obtain a formula $\beta''_{\tau'}$ such that $(T_w, s \cdot 1) \models \beta''_{\tau'}$
iff $t^0_w(i) \models \beta'_{\tau'}$ iff the rank-$m$ type of $t^0_w(i)$ is $\tau'$. Hence, $\beta_\tau$ is expressible in TL$^{tree}$ as a
Boolean combination of propositional letters from $\Sigma$ and formulas $\bigcirc_\downarrow\beta''_{\tau'}$. Note that in this
case, $\beta_\tau$ does not use pcall and pret.

By Claim 4.8, we need to express, for each node $s = \iota_{w\text{-}t}(i)$, the rank-$m$ types of $s_m^{\leftarrow}(w, s)$
and $s_m^{\rightarrow}(w, s)$ in TL$^{tree}$ over $T_w$, as well as the type $\tau_m^w(i)$, in order to express
a quantifier-rank $m$ formula, as it will be a Boolean combination of such formulas. Given
$s$, we need to define $\iota_{w\text{-}t}(i_p)$, the outer position in whose scope $s$ occurs, and then from
that point evaluate two FO formulas, defining rank-$m$ types of words over the alphabet of
rank-$m$ types of finite trees. By Kamp's theorem [13], each such FO formula is equivalent
to an LTL formula whose propositional letters are rank-$m$ types of trees.

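Assume we have an LTL formula $\gamma$ expressing the rank-$m$ type $\tau_0$ of $s_m^{\rightarrow}(w, s)$. By
Kamp's theorem and the separation property for LTL, it is written using only propositional
letters, Boolean connectives, $\bigcirc$ and $\mathbf{U}$ (that is, no $\ominus$ and $\mathbf{S}$). We now inductively take
conjunction of each subformula of $\gamma$ with $\neg(\ominus_\leftarrow\top)$ (i.e., a TL$^{tree}$ formula which is true
in left successors), replace LTL connectives $\bigcirc$ and $\mathbf{U}$ by $\bigcirc_\downarrow$ and $\mathbf{U}_\downarrow$, and replace each
propositional letter $\tau$ by $\beta_\tau$, to obtain a TL$^{tree}$ formula $\gamma'$. Then $(T_w, \iota_{w\text{-}t}(i_p)) \models \gamma'$ iff
$s_m^{\rightarrow}(w, s)$ has type $\tau_0$. Thus, the formula

$$\gamma'' = (\alpha_{outer} \wedge \gamma') \vee \big(\neg\alpha_{outer}\,\mathbf{S}_\downarrow\,(\alpha_{outer} \wedge \gamma')\big)$$

is true in $(T_w, \iota_{w\text{-}t}(i))$ iff the rank-$m$ type of $s_m^{\rightarrow}(w, s)$ is $\tau_0$.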

The proof for $s_m^{\leftarrow}(w, s)$ is similar. Since this word is finite, by Kamp's theorem and
the separation property, there is an LTL formula $\gamma$ that uses $\ominus$, $\mathbf{S}$, propositional letters
and Boolean connectives such that $\gamma$ evaluated in the last position of the word expresses
its rank-$m$ type. Since there is exactly one path from each node to the root, to translate $\gamma$
into a TL$^{tree}$ formula $\gamma'$ we just need to replace propositional letters by the corresponding
formulas $\beta_\tau$, and $\ominus$ by $\ominus_\downarrow$. Then, as for the case of $s_m^{\rightarrow}(w, s)$, we have that $\gamma'$ evaluated
in $\iota_{w\text{-}t}(i_p)$ expresses the type of $s_m^{\leftarrow}(w, s)$. Then finally the same formula as in the case of
$s_m^{\rightarrow}(w, s)$ evaluated in $s$ expresses that type.

Finally we need a TL$^{tree}$ formula that expresses $\tau_m^w(i)$ when
evaluated in $(T_w, \iota_{w\text{-}t}(i))$. We can split this into two cases. If $\alpha_{outer}$ is true in $\iota_{w\text{-}t}(i)$, then,
as explained earlier, the rank-$m$ type of $t_w(i)$ is a Boolean combination of propositional
letters, and thus definable.

So we now consider the case when $\alpha_{outer}$ is not true in $\iota_{w\text{-}t}(i)$. Then $\tau_m^w(i)$ is given by a
Boolean combination of formulas that specify (1) the label of $i_p$, and (2) the rank-$m$ type
of the subtree of $t_w(i_p)$ rooted at the right child of $\iota_{w\text{-}t}(i_p)$ with $s$ as a distinguished node.
This type can be expressed by a formula $\gamma$ in TL$^{tree}$ over $t^0_w(i_p)$ by [19]. Hence if in $\gamma$ we
recursively take the conjunction of each subformula with $\neg\alpha_{outer}$, we obtain a formula $\gamma'$
of TL$^{tree}$ that expresses the type of $(t^0_w(i_p), s)$ when evaluated in $(T_w, s)$. Thus, $\tau_m^w(i)$ is
expressible by a Boolean combination of formulas $\gamma'$ and $\neg\alpha_{outer}\,\mathbf{S}_\downarrow\,(\alpha_{outer} \wedge a)$ where $a$ is a
propositional letter.

This completes the proof of translation of FO into TL$^{tree}$ over nested $\omega$-words, and thus
the proof of Lemma 4.4 and Theorem 4.1. □

Recall that FO$^k$ stands for a fragment of FO that consists of formulas which use at
most $k$ variables in total. First, from our translation from NWTL to FO we get:

Corollary 4.9. Over nested words, every FO formula with at most one free variable is
equivalent to an FO$^3$ formula.

It is well known that LTL over $\omega$-words has the separation property, and in particular,
every LTL formula is equivalent to an LTL formula without the past connectives when
evaluated in the first position of an $\omega$-word. In the case of nested words, however, the
situation is quite different from LTL. The following proposition shows that past connectives
are necessary even when one evaluates formulae in the first position of a nested word. We
let NWTL$^{future}$ be the future fragment of NWTL (i.e. the fragment that does not use $\mathbf{S}^\sigma$
and the operators $\ominus$ and $\ominus_\mu$).

Proposition 4.10. There are FO sentences over nested words that cannot be expressed in
NWTL$^{future}$.

Proof. We shall look at finite nested words; the proof for the infinite case applies
verbatim. To evaluate a formula $\varphi$ of NWTL$^{future}$ in position $i$ of a nested word $w$ of length
$n$ one only needs to look at $w[i, n]$. That is, if $w$ and $w'$ of length $n$ and $n'$ respectively
are such that $w[i, n] = w'[i', n']$, then $(w, i) \models \varphi$ iff $(w', i') \models \varphi$ for every formula $\varphi$ of
NWTL$^{future}$.

Furthermore, for every collection of NWTL$^{future}$ formulas $\Psi = \{\psi_1, \ldots, \psi_l\}$ one can
find a number $k = k(\Psi)$ such that

$$w[i, n] \equiv_k w'[i', n'] \text{ implies } (w, i) \models \psi_p \Leftrightarrow (w', i') \models \psi_p, \text{ for all } p \leq l.$$

In particular, if $b^r$ stands for the word of length $r$ in which all positions are labeled $b$ and
the matching relation is empty, there are numbers $k_1 > k_2$ depending only on $\Psi$, such that

$$b^{k_1} \models \psi_p \Leftrightarrow b^{k_2} \models \psi_p, \text{ for all } p \leq l.$$

Now consider the following NWTL formula:

$$\alpha = \mathbf{call} \wedge \bigcirc_\mu\ominus a,$$

saying that the first position is a call, and the predecessor of its matching return is labeled
$a$. We claim that this is not expressible in NWTL$^{future}$.

Assume to the contrary that there is a formula $\beta$ of NWTL$^{future}$ equivalent to $\alpha$. Let $\Psi$
be the collection of all subformulas of $\beta$, including $\beta$ itself, and let $k_1$ and $k_2$ be constructed
as above. We now consider two nested words $w_1$ and $w_2$ of length $k_1 + 2$ whose underlying
words are $b\,a\,b^{k_1}$ of length $n = k_1 + 2$, such that the matching relation $\mu_1$ of $w_1$ has one edge
$(1, 3)$, and the matching relation $\mu_2$ of $w_2$ has one edge $(1, n + 1 - k_2)$. In other words,
the only return position of $w_1$ is $r_1 = 3$, and the only return position of $w_2$ is $r_2 = n + 1 - k_2$,
and thus $w_1[r_1, n] = b^{k_1}$ and $w_2[r_2, n] = b^{k_2}$. Further notice that for every $i > 1$ we have
$w_1[i, n] = w_2[i, n]$.

Observe that $(w_1, 1) \models \alpha$ and $(w_2, 1) \models \neg\alpha$.
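The two words used in this argument are easy to build and check mechanically. The Python sketch below is an illustration only. The values of `k1` and `k2` are arbitrary stand-ins (in the proof they are obtained from the collection of subformulas), and the encoding of a nested word as a label list plus a dictionary `mu` from matched calls to returns is our assumption.

```python
def alpha_holds(labels, mu):
    """Evaluate alpha = call AND next_mu(previous(a)) at position 1.

    labels[p - 1] is the label of position p; mu maps matched calls to returns.
    """
    if 1 not in mu:               # position 1 must be a matched call
        return False
    r = mu[1]                     # matching return of position 1
    return labels[(r - 1) - 1] == "a"   # label of the predecessor of r


k1, k2 = 5, 3                     # illustrative values only, with k1 > k2
n = k1 + 2
labels = ["b", "a"] + ["b"] * k1  # underlying word b a b^k1
mu1 = {1: 3}                      # w1: single edge (1, 3)
mu2 = {1: n + 1 - k2}             # w2: single edge (1, n + 1 - k2)

print(alpha_holds(labels, mu1))
print(alpha_holds(labels, mu2))
# Lengths of the suffixes starting at the two return positions.
print(n - mu1[1] + 1, n - mu2[1] + 1)
```

The last line confirms that the suffixes starting at $r_1$ and $r_2$ have lengths $k_1$ and $k_2$, which is what lets the induction transfer future formulas between the two return positions.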

We now prove by induction on formulas in $\Psi$ that for each such formula $\gamma$ we have
$(w_1, 1) \models \gamma$ iff $(w_2, 1) \models \gamma$, thus proving that $\beta$ and $\alpha$ cannot be equivalent.

• The base case of propositional letters is immediate. 

• The Boolean combinations are straightforward too. 

• Let $\gamma = \bigcirc\psi$. Then

$$(w_1, 1) \models \gamma \;\Leftrightarrow\; (w_1, 2) \models \psi \;\Leftrightarrow\; (w_2, 2) \models \psi \;\Leftrightarrow\; (w_2, 1) \models \gamma,$$

since $w_1[2, n] = w_2[2, n]$.

• Let $\gamma = \bigcirc_\mu\psi$. Then

$$(w_1, 1) \models \gamma \;\Leftrightarrow\; (w_1, 3) \models \psi \;\Leftrightarrow\; (w_2, n + 1 - k_2) \models \psi \;\Leftrightarrow\; (w_2, 1) \models \gamma,$$

since $\psi \in \Psi$.

• Let $\gamma = \varphi\,\mathbf{U}^\sigma\psi$. Assume $(w_1, 1) \models \gamma$. Consider three cases.

Case 1: $(w_1, 1) \models \psi$. By the hypothesis $(w_2, 1) \models \psi$ and we are done.
Case 2: The witness for $\varphi\,\mathbf{U}^\sigma\psi$ occurs beyond the only return. Then $(w_1, 1) \models \varphi$
and $(w_1, r_1) \models \varphi\,\mathbf{U}^\sigma\psi$. Since $\varphi\,\mathbf{U}^\sigma\psi \in \Psi$, we have $(w_2, r_2) \models \varphi\,\mathbf{U}^\sigma\psi$, and by the
hypothesis, $(w_2, 1) \models \varphi$, so $(w_2, 1) \models \varphi\,\mathbf{U}^\sigma\psi$.

Case 3: The witness for $\varphi\,\mathbf{U}^\sigma\psi$ occurs inside the call. Since for every position $i > 1$ we
have $(w_1, i) \models \varphi$ iff $(w_2, i) \models \varphi$ and likewise for $\psi$, the same summary path witnesses
$\varphi\,\mathbf{U}^\sigma\psi$ in $w_2$.
Thus, $(w_2, 1) \models \gamma$.

Now assume $(w_2, 1) \models \gamma$. The proof of $(w_1, 1) \models \gamma$ is the same as above in Cases 1
and 2. For Case 3, assume that in the path which is a witness for $\varphi\,\mathbf{U}^\sigma\psi$ the position in
which $\psi$ is true is the 2nd or the 3rd position in the word. Then the same path witnesses
$(w_1, 1) \models \gamma$, as in the proof of Case 3 above. Next assume it is a position with index $j$
higher than 3 (which is still labeled $b$) where $\psi$ first occurs. Then $\varphi$ must be true in all
positions $i$ with $3 \leq i < j$ in $w_2$. Hence $\varphi$ is true in all such positions in $w_1$ as well, and
thus the summary path in $w_1$ that skips the first call (i.e. jumps from 1 to 3) witnesses
$\varphi\,\mathbf{U}^\sigma\psi$. Hence, in all the cases $(w_2, 1) \models \gamma$ implies $(w_1, 1) \models \gamma$, which completes the
inductive proof, and thus shows the inexpressibility of $\alpha$ in NWTL$^{future}$. □

Note also that adding all other until/since pairs to NWTL does not change its expressive- 
ness. That is, if we let NWTL+ be NWTL + {U, S, U", S"}, then: 

Corollary 4.11. NWTL+ = FO. 

Later, when we provide our upper bounds for model-checking, we shall provide the upper
bounds with respect to NWTL+ rather than just NWTL.

Remark. In the conference version, we had a corollary stating that the since operator
can be eliminated for formulae evaluated in the first position of a nested word. It relied
on the proof of Theorem 4.1 and the separation property for TL$^{tree}$ claimed in [TU]. The
latter, as was discovered recently, is incorrect. The proof of Theorem 4.1 relies only on the
expressive completeness of TL$^{tree}$, which is correct [26, 20], and thus is not affected.

4.2. The within operator. We now go back to the three until/since operators originally
proposed for temporal logics on nested words, based on the linear, call, and abstract
paths. In other words, our basic logic, denoted by LTL$^\mu$, is

$$\varphi, \varphi' \;:=\; \top \mid a \mid \mathbf{call} \mid \mathbf{ret} \mid \neg\varphi \mid \varphi \vee \varphi' \mid \bigcirc\varphi \mid \bigcirc_\mu\varphi \mid \ominus\varphi \mid \ominus_\mu\varphi \mid \varphi\,\mathbf{U}\varphi' \mid \varphi\,\mathbf{S}\varphi' \mid \varphi\,\mathbf{U}^c\varphi' \mid \varphi\,\mathbf{S}^c\varphi' \mid \varphi\,\mathbf{U}^a\varphi' \mid \varphi\,\mathbf{S}^a\varphi'$$

We now extend this logic with the within operator $\mathcal{W}$ proposed in [2]. Recall that
$(w, i) \models \mathcal{W}\varphi$ iff $i$ is a call, and $(w[i, j], i) \models \varphi$, where $j = r(i)$ if $i$ is a matched call, $j = |w|$
if $i$ is an unmatched call and $w$ is finite, and $j = \infty$ otherwise. We denote this extended
logic by LTL$^\mu$ + $\mathcal{W}$.
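The choice of the subword on which the within operator evaluates its argument can be sketched as follows. This is a hedged illustration for finite nested words only; the function name, the representation (a set `calls` of call positions and a dictionary `mu` from matched calls to returns) and the sample word are our assumptions, and evaluating the formula itself on the subword is left out.

```python
def within_subword(i, n, calls, mu):
    """Return (j, restricted_mu) describing the subword w[i, j] used at position i.

    n is the length of the finite nested word. Returns None when i is not
    a call, in which case the within formula is false at i.
    """
    if i not in calls:
        return None
    j = mu.get(i, n)   # matched call: its return; unmatched call: end of the word
    # Keep only the matching edges that lie entirely inside [i, j].
    restricted_mu = {c: r for c, r in mu.items() if i <= c and r <= j}
    return j, restricted_mu


# Hypothetical word of length 8: calls 2 and 3 are matched, call 6 is pending.
calls = {2, 3, 6}
mu = {2: 5, 3: 4}
print(within_subword(2, 8, calls, mu))
print(within_subword(6, 8, calls, mu))
print(within_subword(4, 8, calls, mu))
```

For an infinite word and an unmatched call the subword would extend to infinity, which this finite sketch does not model.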

Theorem 4.12. LTL$^\mu$ + $\mathcal{W}$ = FO over both finite and infinite nested words.

Proof. The translation from LTL$^\mu$ + $\mathcal{W}$ into FO is similar to the translation used in the
proof of Theorem 4.13. To prove the other direction, we show how to translate NWTL$^{s}$
into LTL$^\mu$ + $\mathcal{W}$. Recall that by Lemma 4.4 we know that NWTL$^{s}$ = FO over both finite
and infinite nested words. More precisely, for every formula $\varphi$ in NWTL$^{s}$, we show how to
construct a formula $\alpha_\varphi$ in LTL$^\mu$ + $\mathcal{W}$ such that for every nested word $w$ (finite or infinite)
and position $i$ in it, we have that $(w, i) \models \varphi$ if and only if $(w, i) \models \alpha_\varphi$.

Since LTL$^\mu$ includes the same past modalities as NWTL$^{s}$, $\alpha_\varphi$ is trivial to define for the
atomic formulas, Boolean combinations and next and previous modalities:





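$$\begin{array}{rcl}
\alpha_\top &:=& \top, \\
\alpha_{\mathbf{call}} &:=& \mathbf{call}, \\
\alpha_{\mathbf{ret}} &:=& \mathbf{ret}, \\
\alpha_a &:=& a, \\
\alpha_{\neg\varphi} &:=& \neg\alpha_\varphi, \\
\alpha_{\varphi \vee \psi} &:=& \alpha_\varphi \vee \alpha_\psi, \\
\alpha_{\bigcirc\varphi} &:=& \bigcirc\alpha_\varphi, \\
\alpha_{\bigcirc_\mu\varphi} &:=& \bigcirc_\mu\alpha_\varphi, \\
\alpha_{\ominus\varphi} &:=& \ominus\alpha_\varphi, \\
\alpha_{\ominus_\mu\varphi} &:=& \ominus_\mu\alpha_\varphi.
\end{array}$$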



Thus, we only need to show how to define $\alpha_{\varphi \mathbf{U}^{s} \psi}$ and $\alpha_{\varphi \mathbf{S}^{s} \psi}$. Formula $\alpha_{\varphi \mathbf{U}^{s} \psi}$ is defined as:



mret A 



V 



mret A ( ( /3(^U"(-imret A 



V 



/3^U'^(/3^ AO(mret Aa^,)) V /3^U"WO(av, A 0(78^(0^ A -0T))) 



where mret is defined as $\mathbf{ret} \wedge \ominus_\mu\top$, to capture matching return positions, $\Diamond\theta$ is defined
as $\top\,\mathbf{U}\,\theta$ and formulas $\beta_\varphi$, $\gamma$ are defined as:

(3^p := OipVmret, 

7 := /3<^S"(a^ A ^ret A 0((a^ A ^ret)Scall)). 

Moreover, formula a^s%'^ is defined as: 

a^si^p ■■= a^, V (a^ A Q{-fS''{f3^S''{a^ A ^mret)))). 

This concludes the proof of the theorem. □

4.3. CaRet and other within operators. The logic CaRet, as defined in [2], did not
have all the operators of LTL$^\mu$. In fact it did not have the previous operators $\ominus$ and $\ominus_\mu$,
and it only had linear and abstract until operators, and the call since operator. That is,
CaRet was defined as

$$\varphi, \varphi' \;:=\; \top \mid a \mid \mathbf{call} \mid \mathbf{ret} \mid \neg\varphi \mid \varphi \vee \varphi' \mid \bigcirc\varphi \mid \bigcirc_\mu\varphi \mid \ominus_c\varphi \mid \varphi\,\mathbf{U}\varphi' \mid \varphi\,\mathbf{U}^a\varphi' \mid \varphi\,\mathbf{S}^c\varphi',$$

and we assume that $a$ ranges over $\Sigma \cup \{\mathbf{pret}\}$, where pret is true in pending returns. Notice
that pret is not expressible with the remaining operators. Recall that the operator $\ominus_c$ is
the previous operator corresponding to call paths; formally, $(w, i) \models \ominus_c\varphi$ iff $\mathcal{C}(i)$ is defined
and $(w, \mathcal{C}(i)) \models \varphi$.

A natural question is whether there is an expressively-complete extension of this logic.
It turns out that the past modality $\ominus$, together with two within operators based on the $\mathcal{C}$ and
$\mathcal{R}$ (the innermost call and its return) functions, provides such an extension. We define two
new formulas $\mathcal{C}\varphi$ and $\mathcal{R}\varphi$ with the semantics as follows:

• (tD, i) \= dp iff i], j) \= 97, where j = C{i) if C{i) is defined, and j = 1 otherwise. 

• {w,i) \= TZip if {w[i,j],i) \= ip, where j = TZ{i) if 7?.(i) is defined, and j = \w\ (if w is 
finite) or 00 (if w is infinite) otherwise. 

The logic obtained by adding $\mathcal{C}$ and $\mathcal{R}$ to CaRet is denoted by CaRet + $\{\mathcal{C}, \mathcal{R}\}$.

Theorem 4.13. CaRet + $\{\mathcal{C}, \mathcal{R}\}$ = FO over both finite and infinite nested words.

As a corollary (to the proof) we obtain the following: 

Corollary 4.14. For every FO formula ip{x) over finite or infinite nested words, there is 
a formula of CaRet + {C, TZ} that does not use the operator, such that w \= (p{i) iff 

The proof of this result is somewhat involved, and relies on different techniques. The 
operators used in CaRet do not correspond naturally to tree translations of nested words, 
and the lack of all until/since pairs makes a translation from NWTL hard. We thus use 
a composition argument directly on nested words. The theorem is proved for finite nested 
words, but the same techniques can be used to prove the infinite case. 

We extend the vocabulary with two constants min and max, and assume that min is 
always interpreted as the first element of the nested word and max as the last element. 

Let $w$ be a finite nested word of length $n$ and $i$ an element in $w$. Let $c_1, \ldots, c_m$,
where $m \geq 0$, be all elements in $w$ such that, for each $j \in [1, m]$, $c_j < i$ and there is
an element $r_j$ such that $\mu(c_j, r_j)$ and $i < r_j$. Assume without loss of generality that
$c_1 < c_2 < \cdots < c_m$.

Fix $k > 0$. Let F be the set of all rank-$k$ types of nested words with distinguished
constants min and max (including the rank-$k$ type of the empty nested word). We define
the word $\Omega_k(w, i) = a_0 a_1 \cdots a_m$ over alphabet F $\times$ F as follows:

• The element $a_0$ is labeled with the tuple whose first component is the rank-$k$ type
of $(w[1, c_1 - 1], \min, \max)$ and whose second component is the rank-$k$ type of $(w[r_1 + 1, n], \min, \max)$
if $m \neq 0$ (notice that if $c_1 = 1$ then $w[1, c_1 - 1]$ is the empty nested word,
and the same is true of $w[r_1 + 1, n]$ if $r_1 = n$); otherwise, it is labeled with the tuple whose
first component is the rank-$k$ type of $(w[1, i - 1], \min, \max)$ and whose second component
is the rank-$k$ type of $(w[i, n], \min, \max)$ (notice that if $i = 1$ then $w[1, i - 1]$ is the empty
nested word);

• for each $0 < j < m$, the element $a_j$ is labeled with the tuple whose first component is
the rank-$k$ type of $(w[c_j, c_{j+1} - 1], \min, \max)$ and whose second component is the rank-$k$
type of $(w[r_{j+1} + 1, r_j], \min, \max)$; and

• if $m \neq 0$ then the element $a_m$ is labeled with the tuple whose first component is the
rank-$k$ type of $(w[c_m, i - 1], \min, \max)$ and whose second component is the rank-$k$ type
of $(w[i, r_m], \min, \max)$.
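The interval structure underlying this word can be computed mechanically. The Python sketch below is illustrative only: it returns, for each letter $a_0, \ldots, a_m$, the pair of position intervals whose rank-$k$ types form the label, and it does not compute the types themselves. The function name, the encoding of the matching relation as a dictionary `mu`, and the sample word are our assumptions; an interval `(lo, hi)` with `lo > hi` stands for the empty nested word.

```python
def omega_intervals(i, n, mu):
    """Left/right intervals behind the letters a_0, ..., a_m for position i.

    mu maps matched calls to returns; the enclosing calls are those pairs
    (c, r) with c < i < r, following the inequalities stated in the text.
    """
    pairs = sorted((c, r) for c, r in mu.items() if c < i < r)
    if not pairs:                                    # m = 0: a single letter
        return [((1, i - 1), (i, n))]
    cs = [c for c, _ in pairs]                       # c_1 < ... < c_m
    rs = [r for _, r in pairs]                       # r_1 > ... > r_m by nesting
    letters = [((1, cs[0] - 1), (rs[0] + 1, n))]     # a_0
    for j in range(len(pairs) - 1):                  # a_1, ..., a_{m-1}
        letters.append(((cs[j], cs[j + 1] - 1), (rs[j + 1] + 1, rs[j])))
    letters.append(((cs[-1], i - 1), (i, rs[-1])))   # a_m
    return letters


# Hypothetical word of length 10 with matching edges (1, 9), (3, 7), (4, 5).
print(omega_intervals(6, 10, {1: 9, 3: 7, 4: 5}))
```

In the sample, the left intervals cover positions 1 to 5 and the right intervals cover positions 6 to 10, so every position of the word falls into exactly one interval of exactly one letter.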

The following is our composition argument: 

Lemma 4.15 (Composition Method). Let $w_1$ and $w_2$ be two nested words, and let $i$ and
$i'$ be two elements in $w_1$ and $w_2$, respectively. Then $\Omega_k(w_1, i) \equiv_{k+2} \Omega_k(w_2, i')$ implies
$(w_1, i, \min, \max) \equiv_k (w_2, i', \min, \max)$.

Proof. First we need to introduce some terminology. Let $w$ be a finite nested word of length
$n$ and $i$ be a position in $w$. Assume elements $c_1, \ldots, c_m, r_1, \ldots, r_m$ are defined as above.
With each element $s$ of $w$ we associate an element $[s]$ of $\Omega_k(w, i)$ as follows:

• If $m \neq 0$ and $s$ belongs to $w[1, c_1 - 1]$ or $w[r_1 + 1, n]$, then $[s]$ is the first element of
$\Omega_k(w, i)$. In such case we say that $w[1, c_1 - 1]$ and $w[r_1 + 1, n]$ are the left and right
intervals represented by $[s]$, respectively.

If $m = 0$ and $s$ is an arbitrary element of $w$, then $[s]$ is also the first (and unique)
element of $\Omega_k(w, i)$. In such case we say that $w[1, i - 1]$ and $w[i, n]$ are the left and right
intervals represented by $[s]$, respectively.

• If $m \neq 0$ and $s$ belongs to $w[c_m, i - 1]$ or $w[i, r_m]$, then $[s]$ is the last element of $\Omega_k(w, i)$. In
such case we say that $w[c_m, i - 1]$ and $w[i, r_m]$ are the left and right intervals represented
by $[s]$, respectively.

• If $m \neq 0$ and $s$ belongs to $w[c_\ell, c_{\ell+1} - 1]$ or $w[r_{\ell+1} + 1, r_\ell]$, for some $1 \leq \ell < m$, then
$[s]$ is the $(\ell + 1)$-th element of $\Omega_k(w, i)$. In such case we say that $w[c_\ell, c_{\ell+1} - 1]$ and
$w[r_{\ell+1} + 1, r_\ell]$ are the left and right intervals represented by $[s]$, respectively.

We denote by [s]^ and [s]^ the left and right intervals represented by [s], respectively. 

We now prove the lemma. For each round $j$ ($0 < j < k$) of the $k$-round game on
$(w_1, i, \min, \max)$ and $(w_2, i', \min, \max)$, Player II's response $b_j$ in $w_2$ to an element $a_j$
in $w_1$ played by Player I is defined as follows (the strategy for the case when Player I
picks a point in $w_2$ is completely symmetric). Assume that Player I plays element $[a_j]$ in
$\Omega_k(w_1, i)$ in round $j$ of the $(k + 2)$-round game on $\Omega_k(w_1, i)$ and $\Omega_k(w_2, i')$. Then given
that $\Omega_k(w_1, i) \equiv_{k+2} \Omega_k(w_2, i')$, Player II uses her winning strategy to choose a response $[q_j]$
in $\Omega_k(w_2, i')$ to $[a_j]$. Thus, by definition of $\Omega_k$, we have that the right and left intervals
represented by $[a_j]$ have the same rank-$k$ type as the right and left intervals represented
by $[q_j]$, respectively. Hence, if $a_j$ belongs to the left interval represented by $[a_j]$, then
Player II can find response $b_j$ to $a_j$ according to the winning strategy for the $k$-round game
on the left intervals represented by $[a_j]$ and $[q_j]$, and if $a_j$ belongs to the right interval
represented by $[a_j]$, then Player II can find response $b_j$ to $a_j$ according to the winning
strategy for the $k$-round game on the right intervals represented by $[a_j]$ and $[q_j]$.

Assume that for round $j$ ($0 \le j < k$) the elements played by following this strategy are (1)
$([p_1], \ldots, [p_j])$ in $\Omega_k(w_1, i)$, (2) $([q_1], \ldots, [q_j])$ in $\Omega_k(w_2, i')$, (3) $(a_1, \ldots, a_j)$ in $w_1$, and (4)
$(b_1, \ldots, b_j)$ in $w_2$. We note that by definition of the strategy, for every $\ell \in [1, j]$, we have
that $a_\ell = p_\ell$ or $b_\ell = q_\ell$. Since we assume that the $[p_\ell]$'s and $[q_\ell]$'s are played according to a
winning strategy for Player II in the $(k + 2)$-round game on $\Omega_k(w_1, i)$ and $\Omega_k(w_2, i')$, it is
the case that:

$(\Omega_k(w_1, i), [p_1], \ldots, [p_j]) \equiv_{k-j+2} (\Omega_k(w_2, i'), [q_1], \ldots, [q_j])$.

By the way the strategy is defined, for each $\ell \in [1, j]$, if $\bar a_\ell^{\leftarrow}$ and $\bar a_\ell^{\rightarrow}$ are the subtuples
of $(a_1, \ldots, a_j)$ containing the elements from $(a_1, \ldots, a_j)$ that belong to $[a_\ell]^{\leftarrow}$ and $[a_\ell]^{\rightarrow}$,
respectively, then the corresponding subtuples $\bar b_\ell^{\leftarrow}$ and $\bar b_\ell^{\rightarrow}$ of $(b_1, \ldots, b_j)$ contain the
elements from $(b_1, \ldots, b_j)$ that belong to $[b_\ell]^{\leftarrow}$ and $[b_\ell]^{\rightarrow}$, respectively. Further, by
definition of the strategy, we also have that
$([a_\ell]^{\leftarrow}, \bar a_\ell^{\leftarrow}, \min, \max) \equiv_{k-j} ([b_\ell]^{\leftarrow}, \bar b_\ell^{\leftarrow}, \min, \max)$ and
$([a_\ell]^{\rightarrow}, \bar a_\ell^{\rightarrow}, \min, \max) \equiv_{k-j} ([b_\ell]^{\rightarrow}, \bar b_\ell^{\rightarrow}, \min, \max)$.

We now show how to define Player II's response in round $j + 1$. Let us assume
without loss of generality that for round $j + 1$ of the game on $(w_1, i, \min, \max)$ and
$(w_2, i', \min, \max)$, Player I picks an element $a_{j+1}$ in $w_1$ that belongs to the left interval
represented by $[a_{j+1}]$ (all the other cases can be treated in a similar way). Player II's response
$b_{j+1}$ in $w_2$ is defined as follows. First, there must be an element $[s]$ in $\Omega_k(w_2, i')$ such that

$(\Omega_k(w_1, i), [p_1], \ldots, [p_j], [p_{j+1}]) \equiv_{k-j+1} (\Omega_k(w_2, i'), [q_1], \ldots, [q_j], [s])$,

where $p_{j+1} = a_{j+1}$. The latter, together with the way that the strategy is defined,
implies that there is an element $b$ in $[s]^{\leftarrow}$ such that
$([a_{j+1}]^{\leftarrow}, \bar a', a_{j+1}, \min, \max) \equiv_{k-j-1} ([s]^{\leftarrow}, \bar b', b, \min, \max)$,
where $\bar a'$ is the subtuple of $(a_1, \ldots, a_j)$ containing all the elements in
$(a_1, \ldots, a_j)$ that belong to $[a_{j+1}]^{\leftarrow}$ and $\bar b'$ is the corresponding subtuple of $(b_1, \ldots, b_j)$. We
then set $b_{j+1} = b$.

We show by induction that, for each $j \le k$, if $(a_1, \ldots, a_j)$ and $(b_1, \ldots, b_j)$ are the
first $j$ moves played by Player I and Player II on $(w_1, i, \min, \max)$ and $(w_2, i', \min, \max)$,
respectively, according to the strategy defined above, then $((a_1, \ldots, a_j), (b_1, \ldots, b_j))$ defines
a partial isomorphism between $(w_1, i, \min, \max)$ and $(w_2, i', \min, \max)$. This is sufficient to
show that $(w_1, i, \min, \max) \equiv_k (w_2, i', \min, \max)$.

Assume $j = 0$. Since $\Omega_k(w_1, i) \equiv_{k+2} \Omega_k(w_2, i')$, it must be the case that the labels of
the last elements of $\Omega_k(w_1, i)$ and $\Omega_k(w_2, i')$ coincide. Thus, $([i]^{\rightarrow}, i) \equiv_0 ([i']^{\rightarrow}, i')$, and we
conclude that $i$ and $i'$ have the same label, and $i$ is a call (resp. return) iff $i'$ is a call (resp.
return). Further, if $i = \min$ then $\Omega_k(w_1, i)$ has only one element and that element is labeled
$(\tau_\varepsilon, \tau)$, for some $\tau \neq \tau_\varepsilon$. Since $\Omega_k(w_1, i) \equiv_{k+2} \Omega_k(w_2, i')$, $\Omega_k(w_2, i')$ also has a single element
and that element is labeled $(\tau_\varepsilon, \tau)$. It follows that $i' = \min$. The converse can be proved
analogously. In the same way it is possible to show that $i = \max$ iff $i' = \max$.

Assume that the property holds for $j$. Also, assume without loss of generality that for
round $j + 1$ of the game on $(w_1, i, \min, \max)$ and $(w_2, i', \min, \max)$, Player I picks an
element $a_{j+1}$ in $w_1$ that belongs to the right interval represented by $[a_{j+1}]$ (all the other
cases can be treated in a similar way). We prove that $b_{j+1}$ as defined above preserves the
partial isomorphism. First we show that $a_{j+1} = i$ iff $b_{j+1} = i'$. In this case $[a_{j+1}]$ is the last
element of $\Omega_k(w_1, i)$, and $\Omega_k(w_1, i) \equiv_{k+2} \Omega_k(w_2, i')$ implies that $[b_{j+1}]$ is the last element of
$\Omega_k(w_2, i')$. Since $a_{j+1} = i$ is the first element of $[a_{j+1}]^{\rightarrow}$, $b_{j+1}$ has to be the first element of
$[b_{j+1}]^{\rightarrow}$, which is $i'$.

In the same way it is possible to prove that $a_{j+1} = \min$ iff $b_{j+1} = \min$, and that
$a_{j+1} = \max$ iff $b_{j+1} = \max$.
Further, it is also clear that the label of $a_{j+1}$ in $w_1$ is $a$ iff the label of $b_{j+1}$ in $w_2$ is $a$,
for each $a \in \Sigma$. Next we consider the remaining cases.

• $a_{j+1} \in \mathrm{call}$. Then $([a_{j+1}]^{\rightarrow}, \bar a', a_{j+1}, \min, \max) \equiv_{k-j-1} ([b_{j+1}]^{\rightarrow}, \bar b', b_{j+1}, \min, \max)$, where
$\bar a'$ is the subtuple of $(a_1, \ldots, a_j)$ containing all the elements in $(a_1, \ldots, a_j)$ that belong
to $[a_{j+1}]^{\rightarrow}$ and $\bar b'$ is the corresponding subtuple of $(b_1, \ldots, b_j)$. This immediately implies
that $b_{j+1} \in \mathrm{call}$. The converse is proved analogously.

• $a_{j+1} \in \mathrm{ret}$. This is similar to the previous case.

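• Suppose first that $a_{j+1} < a_\ell$ holds for some $\ell \in [1, j]$. Since $a_{j+1}$ belongs to $[a_{j+1}]^{\rightarrow}$, we
have that $a_\ell$ belongs to $[a_\ell]^{\rightarrow}$ and, thus, we only need to consider the cases $[a_\ell] = [a_{j+1}]$
and $[a_{j+1}] < [a_\ell]$. If $[a_\ell] = [a_{j+1}]$, then $([a_\ell]^{\rightarrow}, a_\ell, a_{j+1}) \equiv_0 ([b_\ell]^{\rightarrow}, b_\ell, b_{j+1})$ and, therefore,
$b_{j+1} < b_\ell$ also holds. If $[a_{j+1}] < [a_\ell]$, then $[b_{j+1}] < [b_\ell]$ and, thus, $b_{j+1} < b_\ell$ holds since $b_\ell$
and $b_{j+1}$ belong to $[b_\ell]^{\rightarrow}$ and $[b_{j+1}]^{\rightarrow}$, respectively.

Suppose, on the other hand, that $a_\ell < a_{j+1}$ holds for some $\ell \in [1, j]$. We need to
consider three cases: $[a_\ell] = [a_{j+1}]$, $[a_\ell] < [a_{j+1}]$ and $[a_{j+1}] < [a_\ell]$. If $[a_\ell] = [a_{j+1}]$, then
$([a_\ell]^{\rightarrow}, a_\ell, a_{j+1}) \equiv_0 ([b_\ell]^{\rightarrow}, b_\ell, b_{j+1})$ and, therefore, $b_\ell < b_{j+1}$ also holds. If $[a_{j+1}] > [a_\ell]$,
then $a_\ell$ belongs to $[a_\ell]^{\leftarrow}$ and $[b_{j+1}] > [b_\ell]$ and, thus, $b_\ell < b_{j+1}$ holds since $b_\ell$ belongs to
$[b_\ell]^{\leftarrow}$ while $b_{j+1}$ belongs to $[b_{j+1}]^{\rightarrow}$. Finally, if $[a_\ell] > [a_{j+1}]$, then $[b_\ell] > [b_{j+1}]$ and, thus,
$b_\ell < b_{j+1}$ holds since $b_{j+1}$ belongs to $[b_{j+1}]^{\rightarrow}$ and every element in $[b_{j+1}]^{\rightarrow}$ is bigger than
every element in either $[b_\ell]^{\leftarrow}$ or $[b_\ell]^{\rightarrow}$.

The converse is proved analogously.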

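• Suppose first that $\mu(a_{j+1}, a_\ell)$ holds for some $\ell \in [1, j]$. Since $a_{j+1}$ belongs to the right
interval represented by $[a_{j+1}]$, it is the case that $a_\ell$ also belongs to $[a_{j+1}]^{\rightarrow}$. Thus, given
that $([a_\ell]^{\rightarrow}, a_\ell, a_{j+1}) \equiv_0 ([b_\ell]^{\rightarrow}, b_\ell, b_{j+1})$, we conclude that $\mu(b_{j+1}, b_\ell)$ holds.

Second, suppose that $\mu(a_\ell, a_{j+1})$ holds for some $\ell \in [1, j]$. It is not hard to see that $[a_\ell] =
[a_{j+1}]$. We need to consider two cases. If $a_\ell$ belongs to $[a_{j+1}]^{\rightarrow}$, then $([a_\ell]^{\rightarrow}, a_\ell, a_{j+1}) \equiv_0
([b_\ell]^{\rightarrow}, b_\ell, b_{j+1})$, and thus, $\mu(b_\ell, b_{j+1})$ holds. If $a_\ell$ belongs to $[a_{j+1}]^{\leftarrow}$, then $a_\ell$ is the first
element of $[a_{j+1}]^{\leftarrow}$ and $a_{j+1}$ is the last element of $[a_{j+1}]^{\rightarrow}$. Thus, since $([a_{j+1}]^{\leftarrow}, a_\ell, \min) \equiv_0
([b_{j+1}]^{\leftarrow}, b_\ell, \min)$, we conclude that $b_\ell$ is the first element of $[b_{j+1}]^{\leftarrow}$. Further, since
$([a_{j+1}]^{\rightarrow}, a_{j+1}, \max) \equiv_0 ([b_{j+1}]^{\rightarrow}, b_{j+1}, \max)$, we conclude that $b_{j+1}$ is the last element
of $[b_{j+1}]^{\rightarrow}$. Therefore, $\mu(b_\ell, b_{j+1})$ holds.

The converse is proved analogously.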
This concludes the proof of the lemma. □
We now present the proof of Theorem 4.13.

Proof of Theorem 4.13. We first show that every CaRet + $\{\mathcal{R}, \mathcal{C}\}$ formula $\varphi$ is equivalent
to an FO formula $\alpha_\varphi(x)$ over nested words, that is, for every nested word $w$ it is the case
that $(w, i) \models \varphi$ iff $w \models \alpha_\varphi(i)$. The translation is standard, and can be done by recursively
defining $\alpha_\varphi(x)$ from $\varphi$ as shown below. We use the notation $\theta(x)^{[y,z]}$ for the relativization
of $\theta(x)$ to elements in the interval $[y, z]$, that is, $\theta(x)^{[y,z]}$ is obtained from $\theta(x)$ by replacing
each subformula of the form $\exists u\, \beta$ with $\exists u\,(y \le u \wedge u \le z \wedge \beta)$ and each subformula of the
form $\forall u\, \beta$ with $\forall u\,(y \le u \wedge u \le z \to \beta)$. Here is the translation:
$\alpha_{\mathrm{call}}(x) := \mathrm{call}(x)$,
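$\alpha_{\mathrm{ret}}(x) := \mathrm{ret}(x)$,
$\alpha_{\mathrm{int}}(x) := \neg\mathrm{call}(x) \wedge \neg\mathrm{ret}(x)$,
$\alpha_{\mathrm{pret}}(x) := \mathrm{ret}(x) \wedge \neg\exists y\, \mu(y, x)$,
$\alpha_{\neg\varphi}(x) := \neg\alpha_\varphi(x)$,
$\alpha_{\varphi \vee \psi}(x) := \alpha_\varphi(x) \vee \alpha_\psi(x)$,
$\alpha_{\bigcirc\varphi}(x) := \exists y\,(x < y \wedge \neg\exists z\,(x < z \wedge z < y) \wedge \alpha_\varphi(y))$,
$\alpha_{\ominus\varphi}(x) := \exists y\,(y < x \wedge \neg\exists z\,(y < z \wedge z < x) \wedge \alpha_\varphi(y))$,
$\alpha_{\bigcirc_\mu\varphi}(x) := \exists y\,(\mu(x, y) \wedge \alpha_\varphi(y))$,
$\alpha_{\ominus_c\varphi}(x) := \exists y \exists z\,(y < x \wedge x < z \wedge \mu(y, z) \wedge \alpha_\varphi(y) \wedge
        \forall u \forall v\,(u < x \wedge x < v \wedge \mu(u, v) \to u = y \vee u < y))$,
$\alpha_{\varphi U \psi}(x) := \exists y\,((x < y \vee x = y) \wedge \alpha_\psi(y) \wedge
        \forall z\,(z < y \wedge (z = x \vee x < z) \to \alpha_\varphi(z)))$,
$\alpha_{\varphi U^a \psi}(x) := \exists y\,((x < y \vee x = y) \wedge \alpha_\psi(y) \wedge \forall u \forall v\,(u < y \wedge y < v \wedge \mu(u, v) \to u < x) \wedge
        \forall z\,(z < y \wedge (z = x \vee x < z) \wedge \forall u \forall v\,(u < z \wedge z < v \wedge \mu(u, v) \to u < x) \to \alpha_\varphi(z)))$,
$\alpha_{\varphi S^c \psi}(x) := \alpha_\psi(x) \vee \exists y\,(y < x \wedge \alpha_{\mathrm{call}}(y) \wedge \forall z\,(\mu(y, z) \to x < z) \wedge \alpha_\psi(y) \wedge
        \forall z\,(((z = x) \vee (\alpha_{\mathrm{call}}(z) \wedge z < x \wedge y < z \wedge \forall u\,(\mu(z, u) \to x < u))) \to \alpha_\varphi(z)))$,
$\alpha_{\mathcal{C}\varphi}(x) := (\neg\exists y \exists z\,(\mu(y, z) \wedge y < x \wedge x < z) \wedge \forall z\,(\neg\exists u\,(u < z) \to \alpha_\varphi(z)^{[z,x]})) \vee
        (\exists y \exists z\,(\mu(y, z) \wedge y < x \wedge x < z \wedge
        \forall u \forall v\,(u < x \wedge x < v \wedge \mu(u, v) \to u = y \vee u < y) \wedge \alpha_\varphi(y)^{[y,x]}))$,
$\alpha_{\mathcal{R}\varphi}(x) := (\neg\exists y \exists z\,(\mu(y, z) \wedge y < x \wedge x < z) \wedge \forall z\,(\neg\exists u\,(z < u) \to \alpha_\varphi(x)^{[x,z]})) \vee
        (\exists y \exists z\,(\mu(y, z) \wedge y < x \wedge x < z \wedge
        \forall u \forall v\,(u < x \wedge x < v \wedge \mu(u, v) \to u = y \vee u < y) \wedge \alpha_\varphi(x)^{[x,z]}))$.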
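The relativization used in this translation is a purely syntactic rewriting of quantifiers, and it can be sketched on a small formula tree. The following is a minimal illustration, not part of the paper: the class names (`Atom`, `Exists`, and so on) and the textual encoding of atomic formulas are assumptions made for the example, and bound variables are assumed to be distinct from the interval endpoints.

```python
from dataclasses import dataclass


# Hypothetical, minimal FO syntax tree; all class names are illustrative only.
@dataclass(frozen=True)
class Atom:
    text: str  # an atomic formula such as "u < v" or "mu(u, v)", kept as text


@dataclass(frozen=True)
class Not:
    sub: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Implies:
    left: object
    right: object


@dataclass(frozen=True)
class Exists:
    var: str
    body: object


@dataclass(frozen=True)
class Forall:
    var: str
    body: object


def guard(u, y, z):
    """The interval guard  y <= u  and  u <= z."""
    return And(Atom(f"{y} <= {u}"), Atom(f"{u} <= {z}"))


def relativize(f, y, z):
    """Restrict every quantifier of f to range over the interval [y, z]."""
    if isinstance(f, Atom):
        return f
    if isinstance(f, Not):
        return Not(relativize(f.sub, y, z))
    if isinstance(f, (And, Or, Implies)):
        return type(f)(relativize(f.left, y, z), relativize(f.right, y, z))
    if isinstance(f, Exists):
        # exists u. beta   becomes   exists u. (y <= u and u <= z and beta)
        return Exists(f.var, And(guard(f.var, y, z), relativize(f.body, y, z)))
    if isinstance(f, Forall):
        # forall u. beta   becomes   forall u. (y <= u and u <= z -> beta)
        return Forall(f.var, Implies(guard(f.var, y, z), relativize(f.body, y, z)))
    raise TypeError(f"unknown node: {f!r}")
```

Existential quantifiers receive a conjunctive guard and universal quantifiers an implicative one, which is what makes the relativized formula talk only about positions inside the interval.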



We now show the other direction, that is, FO $\subseteq$ CaRet + $\{\mathcal{R}, \mathcal{C}\}$. We start by proving
the result for FO sentences (that is, we prove that for every FO sentence $\varphi$ there is a
CaRet + $\{\mathcal{R}, \mathcal{C}\}$ formula $\psi$ such that $w \models \varphi$ iff $(w, 1) \models \psi$) and then extend it to the case
of FO formulas with one free variable. Let $\varphi$ be an FO sentence. We use induction on the
quantifier rank to prove that $\varphi$ is equivalent to a CaRet + $\{\mathcal{R}, \mathcal{C}\}$ formula.

For $k = 0$ the property trivially holds, as $\varphi$ is a Boolean combination of formulas of the
form $P_a(\min)$, $P_a(\max)$, $\min < \max$, $\mu(\min, \max)$, etc. All of them can be easily expressed
in CaRet + $\{\mathcal{R}, \mathcal{C}\}$.

We now prove the property for $k + 1$, assuming that it holds for $k$. Since every FO sentence
of quantifier rank $k + 1$ is a Boolean combination of FO sentences of the form $\exists x\, \varphi(x)$, where
$\varphi(x)$ is a formula of quantifier rank $k$, we just have to show how to express in CaRet + $\{\mathcal{R}, \mathcal{C}\}$
a sentence of this form.

Let $\Gamma$ be the set of all rank-$k$ types of nested words over alphabet $\Sigma \cup \{\min, \max\}$. We
distinguish by $\tau_\varepsilon$ the rank-$k$ type of the empty nested word. By induction hypothesis, for
each $\tau \in \Gamma$ there is a CaRet + $\{\mathcal{R}, \mathcal{C}\}$ formula $\xi_\tau$ such that $(w, 1) \models \xi_\tau$ iff the rank-$k$ type
of $(w, \min, \max)$ is $\tau$.

Let $\Lambda$ be the set of all rank-$(k + 2)$ types of words over alphabet $\Gamma \times \Gamma$. We first
construct, for each $\lambda \in \Lambda$, a CaRet + $\{\mathcal{R}, \mathcal{C}\}$ formula $\alpha_\lambda$ over alphabet $\Sigma$ such that

$(w, i) \models \alpha_\lambda$ iff the rank-$(k + 2)$ type of $\Omega_k(w, i)$ is $\lambda$,

for each nested word $w$ and position $i$ of $w$.

Fix $\lambda \in \Lambda$. From Kamp's theorem [14], there is an LTL formula $\beta_\lambda$ over alphabet $\Gamma \times \Gamma$
such that a word $u$ satisfies $\beta_\lambda$ evaluated on its last element iff the rank-$(k + 2)$ type of $u$ is
$\lambda$. By the separation property of LTL, we can assume that $\beta_\lambda$ only mentions past modalities
$\ominus$ and $S$. Moreover, given that $\varphi\, S\, \psi \equiv \psi \vee (\varphi \wedge \ominus(\varphi\, S\, \psi))$, we can also assume that $\beta_\lambda$ is
a Boolean combination of formulas of the form either $\theta$ or $\ominus\theta'$, where $\theta$ does not mention
any temporal modality and $\theta'$ is an arbitrary past LTL formula. Thus, since CaRet + $\{\mathcal{R}, \mathcal{C}\}$
is closed under Boolean combinations, to show how to define $\alpha_\lambda$ from $\beta_\lambda$, we only need to
consider two cases: (1) $\beta_\lambda$ is an LTL formula over $\Gamma \times \Gamma$ without temporal modalities, and
(2) $\beta_\lambda$ is of the form $\ominus\theta$, where $\theta$ is an arbitrary past LTL formula over $\Gamma \times \Gamma$. Next we
consider these two cases.

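• Assume that $\beta_\lambda$ is an LTL formula without temporal modalities. Then $\alpha_\lambda$ is defined to
be $\beta_\lambda^\circ$, where $(\ )^\circ$ is defined recursively as follows. Given $(\tau, \tau') \in \Gamma \times \Gamma$, $(\tau, \tau')^\circ$ is defined
as follows, where we assume that $\tau_a$ is the rank-$k$ type of any nested word with a single
element labeled $a$ ($a \in \Sigma$):

(1) If $\tau, \tau' \neq \tau_\varepsilon$, then $(\tau, \tau')^\circ$ is defined as the disjunction of the following formulas:

(a) $(\neg\mathrm{ret} \wedge \ominus_c\top \wedge \ominus(\neg\mathrm{call} \wedge \mathcal{C}\xi_\tau) \wedge \mathcal{R}\xi_{\tau'})$;

(b) $\bigvee_{\{a \mid \tau = \tau_a\}} (\neg\mathrm{ret} \wedge \ominus_c\top \wedge \ominus(\mathrm{call} \wedge a) \wedge \mathcal{R}\xi_{\tau'})$;

(c) $(\neg\mathrm{ret} \wedge \neg\ominus_c\top \wedge \ominus\mathcal{C}\xi_\tau \wedge \mathcal{R}\xi_{\tau'})$;

(d) $(\mathrm{pret} \wedge \ominus\mathcal{C}\xi_\tau \wedge \mathcal{R}\xi_{\tau'})$;

(e) $\bigvee_{\{a \mid \tau' = \tau_a\}} (\mathrm{ret} \wedge \neg\mathrm{pret} \wedge a \wedge \ominus(\neg\mathrm{call} \wedge \mathcal{C}\xi_\tau))$;

(f) $\bigvee_{\{(a,b) \mid \tau = \tau_a,\ \tau' = \tau_b\}} (\mathrm{ret} \wedge \neg\mathrm{pret} \wedge b \wedge \ominus(\mathrm{call} \wedge a))$.

(2) if $\tau' = \tau_\varepsilon$, then $(\tau, \tau')^\circ$ is simply $\neg\top$; and

(3) if $\tau = \tau_\varepsilon$ and $\tau' \neq \tau_\varepsilon$, then $(\tau, \tau')^\circ$ is defined as $\neg\ominus\top \wedge \mathcal{R}\xi_{\tau'}$.
Furthermore, if $\psi$ and $\varphi$ are LTL formulas without temporal modalities, then
$(\neg\psi)^\circ := \neg\psi^\circ$ and $(\psi \vee \varphi)^\circ := \psi^\circ \vee \varphi^\circ$.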

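• Assume that $\beta_\lambda$ is a formula of the form $\ominus\theta$, where $\theta$ is an arbitrary past LTL formula.
Then $\alpha_\lambda$ is defined to be $\beta_\lambda^*$, where $(\ )^*$ is defined recursively as follows. Given $(\tau, \tau') \in
\Gamma \times \Gamma$, $(\tau, \tau')^*$ is defined as follows:

(1) If $\tau, \tau' \neq \tau_\varepsilon$, then $(\tau, \tau')^*$ is defined as the disjunction of the following formulas:

(a) $\ominus((\neg\mathrm{call} \vee (\mathrm{call} \wedge \neg\bigcirc_\mu\top)) \wedge \mathcal{C}\xi_\tau) \wedge \bigcirc_\mu\bigcirc((\neg\mathrm{ret} \vee \mathrm{pret}) \wedge \mathcal{R}\xi_{\tau'})$;

(b) $\bigvee_{\{a \mid \tau = \tau_a\}} \ominus(\mathrm{call} \wedge \bigcirc_\mu\top \wedge a) \wedge \bigcirc_\mu\bigcirc(\neg\mathrm{ret} \wedge \mathcal{R}\xi_{\tau'})$;

(c) $\bigvee_{\{(a,b) \mid \tau = \tau_a,\ \tau' = \tau_b\}} \ominus(\mathrm{call} \wedge \bigcirc_\mu\top \wedge a) \wedge \bigcirc_\mu\bigcirc(\mathrm{ret} \wedge b)$;

(d) $\bigvee_{\{a \mid \tau' = \tau_a\}} \ominus(\neg\mathrm{call} \wedge \mathcal{C}\xi_\tau) \wedge \bigcirc_\mu\bigcirc(\mathrm{ret} \wedge \neg\mathrm{pret} \wedge a)$.

(2) if $\tau = \tau_\varepsilon$ and $\tau' \neq \tau_\varepsilon$, then $(\tau, \tau')^*$ is defined as $\neg\ominus\top \wedge \bigcirc_\mu\bigcirc\mathcal{R}\xi_{\tau'}$;

(3) if $\tau \neq \tau_\varepsilon$ and $\tau' = \tau_\varepsilon$, then $(\tau, \tau')^*$ is defined as $\ominus\mathcal{C}\xi_\tau \wedge \neg\bigcirc_\mu\bigcirc\top$; and

(4) if $\tau, \tau' = \tau_\varepsilon$, then $(\tau, \tau')^*$ is defined as $\neg\ominus\top \wedge \neg\bigcirc_\mu\bigcirc\top$.

Furthermore, if $\psi$ and $\theta$ are past LTL formulas, then
$(\neg\psi)^* := \neg\psi^*$, $(\psi \vee \theta)^* := \psi^* \vee \theta^*$, $(\ominus\psi)^* := \ominus_c\psi^*$ and $(\psi\, S\, \theta)^* := \psi^*\, S^c\, \theta^*$.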

Now, let $\exists x\, \varphi(x)$ be an FO sentence such that the quantifier rank of $\varphi(x)$ is $k$. Then,
from our composition method, $\varphi(x)$ can be expressed in CaRet + $\{\mathcal{R}, \mathcal{C}\}$ as the formula
$\bigvee_{\lambda \in \Lambda'} \alpha_\lambda$, where $\Lambda' \subseteq \Lambda$ is the set of all rank-$(k + 2)$ types of words over alphabet $\Gamma \times \Gamma$
that belong to $\{\Omega_k(w, i) \mid w \models \varphi(i)\}$. Thus, $\exists x\, \varphi(x)$ can be expressed as the following
CaRet + $\{\mathcal{R}, \mathcal{C}\}$ formula: $\top\, U\, (\bigvee_{\lambda \in \Lambda'} \alpha_\lambda)$. This concludes the proof of the theorem.

Finally, from the composition method and the previous proof we see that the equivalence
FO = CaRet + $\{\mathcal{R}, \mathcal{C}\}$ also holds for unary queries over nested words. □



5. Model-Checking and Satisfiability 

In this section we show that both satisfiability and model-checking are decidable in
single-exponential time for NWTL, and in polynomial time in the size of the model. Here we
assume the model of the procedural program is given as a Recursive State Machine (RSM)
[1]. (Runs of an RSM can naturally be viewed as nested words when matching function
calls (or "box entries") and returns (or "box exits") along the run are paired together.)
In fact we prove this bound for NWTL$^+$, an FO-complete extension of NWTL
with all of $U$, $S$, $U^c$, $S^c$, $U^a$, $S^a$. We use automata-theoretic techniques: translating
formulae into equivalent automata on nested words. We then show that the logic based on
adding the within operator to NWTL$^+$ (and even just adding within to CaRet) requires
doubly-exponential time for model-checking, but is exponentially more succinct.

5.1. Nested word automata. A nondeterministic Büchi nested word automaton (BNWA)
$A$ over an alphabet $\Sigma$ is a structure $(Q, Q_0, Q_f, P, P_0, P_f, \delta_c, \delta_i, \delta_r)$ consisting of a finite set
of states $Q$, a set of initial states $Q_0 \subseteq Q$, a set of Büchi accepting states $Q_f \subseteq Q$, a
set of hierarchical symbols $P$, a set of initial hierarchical symbols $P_0 \subseteq P$, a set of final
hierarchical symbols $P_f \subseteq P$, a call-transition relation $\delta_c \subseteq Q \times \Sigma \times Q \times P$, an
internal-transition relation $\delta_i \subseteq Q \times \Sigma \times Q$, and a return-transition relation $\delta_r \subseteq Q \times P \times \Sigma \times Q$.
The automaton $A$ starts in an initial state and reads the nested word from left to right.
The state is propagated along the linear edges as in the case of a standard word automaton.
However, at a call, the nested word automaton propagates state along the linear edge as
well as a hierarchical symbol along the nesting edge (if there is no matching return, then
the latter is required to be in $P_f$ for acceptance). At a matched return, the new state
is determined based on the state propagated along the linear edge as well as the symbol
along the incoming nesting edge (edges incident upon unmatched returns are assumed to
be labeled with initial hierarchical symbols).

Formally, a run $r$ of the automaton $A$ over a nested word $w = (a_1 a_2 \cdots, \mathrm{call}, \mathrm{ret})$
is a sequence $q_0, q_1, \ldots$ of states along the linear edges, and a sequence $p_i$, for every call
position $i$, of hierarchical symbols along nesting edges, such that $q_0 \in Q_0$ and for each
position $i$: if $i$ is a call, then $(q_{i-1}, a_i, q_i, p_i) \in \delta_c$; if $i$ is internal, then $(q_{i-1}, a_i, q_i) \in \delta_i$; if $i$
is a return such that $\mu(j, i)$, then $(q_{i-1}, p_j, a_i, q_i) \in \delta_r$; and if $i$ is an unmatched return, then
$(q_{i-1}, p, a_i, q_i) \in \delta_r$ for some $p \in P_0$. The run $r$ is accepting if (1) for all pending calls $i$,
$p_i \in P_f$, and (2) the final state $q_\ell \in Q_f$ if $w$ is a finite word of length $\ell$, and for infinitely
many positions $i$, $q_i \in Q_f$, if $w$ is a nested $\omega$-word. The automaton $A$ accepts the nested
word $w$ if it has an accepting run over $w$.

Nested word automata have the same expressiveness as monadic second-order
logic over nested words, and the language emptiness problem for them can be decided
in polynomial time [3].
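For finite nested words, the run and acceptance conditions above can be checked by a direct simulation. The sketch below is only an illustration under stated assumptions: the automaton is encoded as a plain dictionary with hypothetical keys (`Q0`, `Qf`, `P0`, `Pf`, `delta_c`, `delta_i`, `delta_r`), a word is a list of `(kind, letter)` pairs, and nondeterminism is resolved by tracking all reachable configurations. It does not cover the Büchi condition for $\omega$-words.

```python
def accepts(nwa, word):
    """Decide whether the automaton has an accepting run on a finite nested word.

    word is a list of (kind, letter) pairs with kind in {"call", "int", "ret"}.
    A configuration is (state, stack); the stack holds the hierarchical symbols
    of the calls that are still open, innermost last.
    """
    configs = {(q, ()) for q in nwa["Q0"]}
    for kind, a in word:
        nxt = set()
        for q, stack in configs:
            if kind == "call":
                for (q1, b, q2, p) in nwa["delta_c"]:
                    if q1 == q and b == a:
                        nxt.add((q2, stack + (p,)))
            elif kind == "int":
                for (q1, b, q2) in nwa["delta_i"]:
                    if q1 == q and b == a:
                        nxt.add((q2, stack))
            else:  # return position
                for (q1, p, b, q2) in nwa["delta_r"]:
                    if q1 != q or b != a:
                        continue
                    if stack:
                        # matched return: use the symbol sent by its call
                        if stack[-1] == p:
                            nxt.add((q2, stack[:-1]))
                    elif p in nwa["P0"]:
                        # unmatched return: any initial hierarchical symbol
                        nxt.add((q2, stack))
        configs = nxt
    # final state accepting, and every pending call labeled by a final symbol
    return any(
        q in nwa["Qf"] and all(p in nwa["Pf"] for p in stack)
        for q, stack in configs
    )


# A toy automaton (an assumption for illustration) accepting exactly the
# well-matched words over the single letter "a".
WELL_MATCHED = {
    "Q0": {"q"}, "Qf": {"q"},
    "P0": {"bot"}, "Pf": set(),
    "delta_c": {("q", "a", "q", "x")},
    "delta_i": {("q", "a", "q")},
    "delta_r": {("q", "x", "a", "q")},
}
```

The stack plays the role of the nesting edges: a matched return pops the symbol that its call pushed, while an empty stack at a return means the return is unmatched.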

5.2. Tableau construction. We now show how to build a BNWA accepting the satisfying
models of a formula of NWTL$^+$. This leads to decision procedures for satisfiability and
model checking.

Given a formula $\varphi$, we wish to construct a Büchi nested word automaton $A_\varphi$ whose
states correspond to sets of subformulas of $\varphi$. Intuitively, given a nested word $w$, a run
$r$, which is a linear sequence $q_0 q_1 \cdots$ of states and symbols $p_i$ labeling nesting edges from
call positions, should be such that each state $q_i$ is precisely the set of formulas that hold
at position $i + 1$. The label $p_i$ is used to remember abstract-next formulas that hold at
position $i$ and the abstract-previous formulas that hold at the matching return. For clarity
of presentation, we first focus on formulas with next operators $\bigcirc$ and $\bigcirc_\mu$, and until over
summary-down paths.

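Given a formula $\varphi$, the closure of $\varphi$, denoted by $cl(\varphi)$, is the smallest set that satisfies
the following properties:

• $cl(\varphi)$ contains $\varphi$, call, ret, int, and $\bigcirc\mathrm{ret}$;

• if either $\neg\psi$, or $\bigcirc\psi$ or $\bigcirc_\mu\psi$ is in $cl(\varphi)$ then $\psi \in cl(\varphi)$;

• if $\psi \vee \psi' \in cl(\varphi)$, then $\psi, \psi' \in cl(\varphi)$;

• if $\psi\, U^{\sigma\downarrow}\, \psi' \in cl(\varphi)$, then $\psi$, $\psi'$, $\bigcirc(\psi\, U^{\sigma\downarrow}\, \psi')$, and $\bigcirc_\mu(\psi\, U^{\sigma\downarrow}\, \psi')$ are in $cl(\varphi)$; and

• if $\psi \in cl(\varphi)$ and $\psi$ is not of the form $\neg\theta$ (for any $\theta$), then $\neg\psi \in cl(\varphi)$.

It is straightforward to see that the size of $cl(\varphi)$ is only linear in the size of $\varphi$. Henceforth,
we identify $\neg\neg\psi$ with the formula $\psi$.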
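The closure rules above translate directly into a worklist computation. The following sketch is an illustration only: formulas are encoded as nested tuples, and the tags `"ap"`, `"not"`, `"or"`, `"next"`, `"mnext"` (abstract next) and `"until_down"` (summary-down until) are names chosen for this example rather than notation from the paper.

```python
# Distinguished propositions marking the type of a position.
CALL, RET, INT = ("ap", "call"), ("ap", "ret"), ("ap", "int")


def neg(f):
    """Negation that identifies not-not-psi with psi."""
    return f[1] if f[0] == "not" else ("not", f)


def closure(phi):
    """Smallest set of formulas containing phi and closed under the rules above."""
    todo = [phi, CALL, RET, INT, ("next", RET)]
    cl = set()
    while todo:
        f = todo.pop()
        if f in cl:
            continue
        cl.add(f)
        op = f[0]
        if op in ("not", "next", "mnext"):
            todo.append(f[1])
        elif op == "or":
            todo += [f[1], f[2]]
        elif op == "until_down":
            # both arguments, plus the next and abstract-next obligations
            todo += [f[1], f[2], ("next", f), ("mnext", f)]
        if op != "not":
            # close under negation without ever creating a double negation
            todo.append(("not", f))
    return cl
```

Each subformula contributes a bounded number of closure elements, which is the reason the closure stays linear in the size of the formula.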

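An atom of $\varphi$ is a set $\Phi \subseteq cl(\varphi)$ that satisfies the following properties:

• For every $\psi \in cl(\varphi)$, $\psi \in \Phi$ iff $\neg\psi \notin \Phi$.

• For every formula $\psi \vee \psi' \in cl(\varphi)$, $\psi \vee \psi' \in \Phi$ iff ($\psi \in \Phi$ or $\psi' \in \Phi$).

• For every formula $\psi\, U^{\sigma\downarrow}\, \psi' \in cl(\varphi)$, $\psi\, U^{\sigma\downarrow}\, \psi' \in \Phi$ iff either $\psi' \in \Phi$ or ($\psi \in \Phi$ and
$\bigcirc\mathrm{ret} \notin \Phi$ and $\bigcirc(\psi\, U^{\sigma\downarrow}\, \psi') \in \Phi$) or ($\psi \in \Phi$ and $\bigcirc_\mu(\psi\, U^{\sigma\downarrow}\, \psi') \in \Phi$).

• $\Phi$ contains exactly one of the elements in the set {call, ret, int}.

• If $\bigcirc_\mu\psi \in \Phi$ for some $\psi$, then call $\in \Phi$.

These clauses capture local consistency requirements. In particular, a summary-down until
formula $\psi\, U^{\sigma\downarrow}\, \psi'$ holds at a position if either the second argument $\psi'$ holds now, or $\psi$ holds
now and satisfaction of $\psi\, U^{\sigma\downarrow}\, \psi'$ is propagated along a call, internal, or nesting edge.

A hierarchical-atom of $\varphi$ is a set $\Phi \subseteq cl(\varphi)$ such that if $\psi \in \Phi$ then $\bigcirc_\mu\psi \in cl(\varphi)$.
A hierarchical-atom contains possible abstract-next obligations to be propagated across
nesting edges.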

Given a formula $\varphi$, we build a nested word automaton $A_\varphi$ as follows. The alphabet $\Sigma$
is $2^{AP}$, where $AP$ is the set of atomic propositions.

(1) Atoms of $\varphi$ are states of $A_\varphi$;

(2) An atom $\Phi$ is an initial state iff $\varphi \in \Phi$;

(3) Hierarchical-atoms of $\varphi$ are hierarchical symbols of $A_\varphi$;

(4) All hierarchical symbols are initial;

(5) For atoms $\Phi, \Psi$ and a symbol $a \subseteq AP$, $(\Phi, a, \Psi)$ is an internal transition of $A_\varphi$ iff (a)
int $\in \Phi$; and (b) for $p \in AP$, $p \in a$ iff $p \in \Phi$; and (c) for each $\bigcirc\psi \in cl(\varphi)$, $\psi \in \Psi$ iff
$\bigcirc\psi \in \Phi$.

(6) For atoms $\Phi, \Psi_l$, a hierarchical-atom $\Psi_h$, and a symbol $a \subseteq AP$, $(\Phi, a, \Psi_l, \Psi_h)$ is a call
transition of $A_\varphi$ iff (a) call $\in \Phi$; and (b) for $p \in AP$, $p \in a$ iff $p \in \Phi$; and (c) for each
$\bigcirc\psi \in cl(\varphi)$, $\psi \in \Psi_l$ iff $\bigcirc\psi \in \Phi$; and (d) for each $\bigcirc_\mu\psi \in cl(\varphi)$, $\psi \in \Psi_h$ iff $\bigcirc_\mu\psi \in \Phi$.

(7) For atoms $\Phi_l, \Psi$, a hierarchical-atom $\Phi_h$, and a symbol $a \subseteq AP$, $(\Phi_l, \Phi_h, a, \Psi)$ is a return
transition of $A_\varphi$ iff (a) ret $\in \Phi_l$; and (b) for $p \in AP$, $p \in a$ iff $p \in \Phi_l$; and (c) for each
$\bigcirc\psi \in cl(\varphi)$, $\psi \in \Psi$ iff $\bigcirc\psi \in \Phi_l$; and (d) for each $\bigcirc_\mu\psi \in cl(\varphi)$, $\psi \in \Phi_h$ iff $\psi \in \Phi_l$.

The transition relation ensures that the current symbol is consistent with the atomic
propositions in the current state, and that the requirements of next operators are correctly propagated.

The sole final hierarchical symbol is the empty hierarchical-atom. This ensures that,
in an accepting run, at a pending call, no requirements are propagated along the nesting
edge. For each until-formula $\psi$ in the closure, let $F_\psi$ be the set of atoms that either do not
contain $\psi$ or contain the second argument of $\psi$. Then a nested word $w$ over the alphabet
$2^{AP}$ satisfies $\varphi$ iff there is a run $r$ of $A_\varphi$ over $w$ such that all pending call edges are labeled
with the sole final hierarchical symbol, and for each until-formula $\psi \in cl(\varphi)$, for infinitely
many positions $i$, $q_i \in F_\psi$. This multi-Büchi accepting condition can be translated to Büchi
acceptance as usual by adding a counter.
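The counter construction mentioned here is the textbook degeneralization of a multi-Büchi condition. The sketch below shows it on a plain labeled transition relation; this is a simplifying assumption, since for a nested word automaton the same counter would be carried through call, internal, and return transitions alike. The function and parameter names are illustrative.

```python
def degeneralize(states, edges, acc_sets):
    """Turn a multi-Buchi condition F_0, ..., F_{k-1} into a single Buchi set.

    edges is a set of (q, label, q2) triples and acc_sets a non-empty list of
    sets of states. New states are pairs (q, c), where the counter c records
    which acceptance set the run is currently waiting to visit.
    """
    k = len(acc_sets)

    def bump(q, c):
        # advance the counter when leaving a state of the awaited set
        return (c + 1) % k if q in acc_sets[c] else c

    new_states = {(q, c) for q in states for c in range(k)}
    new_edges = {
        ((q, c), a, (q2, bump(q, c)))
        for (q, a, q2) in edges
        for c in range(k)
    }
    # the counter returns to 0 infinitely often iff every set is visited
    # infinitely often, so accepting states are those of F_0 with counter 0
    accepting = {(q, 0) for q in acc_sets[0]}
    return new_states, new_edges, accepting
```

The state space grows only by a factor of $k$, the number of until-formulas, so the exponential size bound of the tableau automaton is unaffected.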

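Now we proceed to show how to handle various forms of until operators. In each
case, we specify the changes needed to the definition of the closure and local consistency
requirements for atoms.

Global paths: If $\psi\, U\, \psi' \in cl(\varphi)$, then $\psi$, $\psi'$, $\bigcirc(\psi\, U\, \psi')$ are in $cl(\varphi)$. Local consistency of
$\Phi$ requires that for every formula $\psi\, U\, \psi' \in cl(\varphi)$, $\psi\, U\, \psi' \in \Phi$ iff either $\psi' \in \Phi$ or ($\psi \in \Phi$
and $\bigcirc(\psi\, U\, \psi') \in \Phi$).

Summary-up paths: If $\psi\, U^{\sigma\uparrow}\, \psi' \in cl(\varphi)$, then $\psi$, $\psi'$, $\bigcirc(\psi\, U^{\sigma\uparrow}\, \psi')$, and $\bigcirc_\mu(\psi\, U^{\sigma\uparrow}\, \psi')$
are in $cl(\varphi)$. Local consistency of $\Phi$ requires that for every formula $\psi\, U^{\sigma\uparrow}\, \psi' \in cl(\varphi)$,
$\psi\, U^{\sigma\uparrow}\, \psi' \in \Phi$ iff either $\psi' \in \Phi$ or ($\psi \in \Phi$ and call $\in \Phi$ and $\bigcirc_\mu(\psi\, U^{\sigma\uparrow}\, \psi') \in \Phi$) or
($\psi \in \Phi$ and call $\notin \Phi$ and $\bigcirc(\psi\, U^{\sigma\uparrow}\, \psi') \in \Phi$).

Abstract paths: If $\psi\, U^a\, \psi' \in cl(\varphi)$, then $\psi$, $\psi'$, $\bigcirc(\psi\, U^a\, \psi')$, $\bigcirc\ominus_\mu\top$, and $\bigcirc_\mu(\psi\, U^a\, \psi')$
are in $cl(\varphi)$. Local consistency of $\Phi$ requires that for every formula $\psi\, U^a\, \psi' \in cl(\varphi)$,
$\psi\, U^a\, \psi' \in \Phi$ iff either $\psi' \in \Phi$ or ($\psi \in \Phi$ and call $\in \Phi$ and $\bigcirc_\mu(\psi\, U^a\, \psi') \in \Phi$), or
($\psi \in \Phi$ and call $\notin \Phi$ and $\bigcirc\mathrm{ret} \notin \Phi$ and $\bigcirc(\psi\, U^a\, \psi') \in \Phi$) or ($\psi \in \Phi$ and $\bigcirc\mathrm{ret} \in \Phi$
and $\bigcirc\ominus_\mu\top \notin \Phi$ and $\bigcirc(\psi\, U^a\, \psi') \in \Phi$). The last case accounts for propagation of the
eventuality across unmatched returns.

Call paths: Recall that positions along a call path are related by the innermost call
operator: a call path jumps from a call position $i$ to a position $j$ such that $i = C(j)$. Thus,
a call path can be simulated by a summary-down path consisting of call edges, summary
edges and internal edges, where the formula is asserted only before following the call
edge. This effect is captured by using an auxiliary operator $U^{c'}$ as follows. If $\psi\, U^c\, \psi' \in cl(\varphi)$,
then $\psi$, $\psi'$, $\psi\, U^{c'}\, \psi'$, $\bigcirc(\psi\, U^{c'}\, \psi')$, and $\bigcirc_\mu(\psi\, U^{c'}\, \psi')$ are in $cl(\varphi)$. Local consistency of
$\Phi$ requires that for every formula $\psi\, U^c\, \psi' \in cl(\varphi)$, $\psi\, U^c\, \psi' \in \Phi$ iff either $\psi' \in \Phi$ or
($\psi \in \Phi$ and call $\in \Phi$ and $\bigcirc(\psi\, U^{c'}\, \psi') \in \Phi$); and $\psi\, U^{c'}\, \psi' \in \Phi$ iff either $\psi\, U^c\, \psi' \in \Phi$ or
$\bigcirc_\mu(\psi\, U^{c'}\, \psi') \in \Phi$ or ($\bigcirc\mathrm{ret} \notin \Phi$ and $\bigcirc(\psi\, U^{c'}\, \psi') \in \Phi$).

Summary paths: The summary-until is handled using the fact that $\psi\, U^\sigma\, \psi'$ is equivalent
to $\psi\, U^{\sigma\uparrow}\, (\psi\, U^{\sigma\downarrow}\, \psi')$.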

Note that the definition of $A_\varphi$ stays unchanged, as the correct propagation of requirements
is handled by next and abstract-next formulas ensured by local consistency. The eventual
satisfaction of until formulas is handled the same way as before: for each until-formula $\psi$ in
the closure, let $F_\psi$ be the set of atoms that either do not contain $\psi$ or contain the second
argument of $\psi$, and it is required that each such $F_\psi$ is visited infinitely often.

The past-time formulas (previous, abstract-previous, and various forms of since
operators) are handled in a symmetric manner. Thus, we have shown:

Theorem 5.1. For a formula $\varphi$ of NWTL$^+$, one can effectively construct a nondeterministic
Büchi nested word automaton of size $2^{O(|\varphi|)}$ accepting the models of $\varphi$.

Since the automaton $A_\varphi$ is exponential in the size of $\varphi$, we can check satisfiability
of $\varphi$ in exponential time by testing emptiness of $A_\varphi$. EXPTIME-hardness follows from the
corresponding hardness result for CaRet.

Corollary 5.2. The satisfiability problem for NWTL$^+$ is EXPTIME-complete.

When programs are modeled by nested word automata $A$ (or equivalently, pushdown
automata, or recursive state machines), and specifications are given by formulas $\varphi$ of NWTL$^+$,
we can use the classical automata-theoretic approach: negate the specification, build the
NWA $A_{\neg\varphi}$ accepting models that violate $\varphi$, take the product with the program $A$, and
test for emptiness of $L(A) \cap L(A_{\neg\varphi})$. Note that the program typically will be given more
compactly, say, as a Boolean program [5], and thus, the NWA $A$ may itself be exponential
in the size of the input.

Corollary 5.3. Model checking NWTL$^+$ specifications with respect to Boolean programs
is EXPTIME-complete. If the program model is given as a recursive state machine or
nested word automaton, the running time is polynomial in the model and exponential in
the NWTL$^+$ formula, and remains EXPTIME-complete.

5.3. Checking the within operator. We now show that adding within operators makes
model-checking doubly exponential. Given a formula $\varphi$ of NWTL or NWTL$^+$, let $p_\varphi$ be
a special proposition that does not appear in $\varphi$. Let $W_\varphi$ be the language of nested words
$w$ such that for each position $i$, $(w, i) \models p_\varphi$ iff $(w, i) \models W\varphi$. We construct a
doubly-exponential automaton $B$ that captures $W_\varphi$. First, using the tableau construction for
NWTL$^+$, we construct an exponential-size automaton $A$ that captures nested words that
satisfy $\varphi$. Intuitively, every time a proposition $p_\varphi$ is encountered, we want to start a new
copy of $A$, and a state of $B$ keeps track of states of multiple copies of $A$. At a call, $B$
guesses whether the call has a matching return or not. In the latter case, as in the case of the
determinization construction for nested word automata [4], we need to maintain pairs of
states of $A$ so that the join at return positions can be done correctly. A state of $B$, then, is
either a set of states of $A$ or a set of pairs of states of $A$. We explain the latter case. The
intended meaning is that a pair $(q, q')$ belongs to the state of $B$, while reading position $i$ of a
nested word $w$, if the subword from $i$ to the first unmatched return can take the automaton
$A$ from state $q$ to state $q'$. When reading an internal symbol $a$, a summary $(q, q')$ in the
current state can be updated to $(u, q')$, provided $A$ has an internal transition from $q$ to $u$ on
symbol $a$. Let $B$ read a call symbol $a$. Consider a summary $(q, q')$ in the current state, and
a call-transition $(q, a, q_l, q_h)$ of $A$. Then $B$ guesses the return transition $(u_l, q_h, b, u)$ that
will be used by $A$ at the matching return, and sends the summary $(q_l, u_l)$ along the call
edge and the triple $(b, u, q')$ along the nesting edge. While processing a return symbol $b$,
the current state of $B$ must contain summaries only of the form $(q, q)$ where the two states
match, and for each summary $(b, u, q')$ retrieved from the state along the nesting edge, the
new state contains $(u, q')$. Finally, $B$ must enforce that $W\varphi$ holds when $p_\varphi$ is read. Only a
call symbol $a$ can contain the proposition $p_\varphi$, and when reading such a symbol, $B$ guesses
a call transition $(q_0, a, q_l, q_h)$, where $q_0$ is the initial state of $A$, and a return transition
$(u_l, q_h, b, q_f)$, where $q_f$ is an accepting state of $A$, and sends the summary $(q_l, u_l)$ along the
call edge and the symbol $b$ along the nesting edge.
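Two of the summary updates described above, for internal and for return symbols, are simple enough to sketch in isolation. The code below is an illustration under assumptions: summaries are pairs of states of $A$, the pending information retrieved from a nesting edge is a set of triples, and a mismatch between a guessed and an actual return symbol is treated as a dead run (this last point is a reading of the construction, not something spelled out in the text). The function names are hypothetical.

```python
def internal_updates(summary, a, delta_i):
    """Possible updates of one summary (q, target) on internal symbol a.

    delta_i is the internal-transition relation of A, a set of (q, letter, u).
    B picks one of the returned summaries for each summary it currently holds.
    """
    q, target = summary
    return {(u, target) for (q1, b, u) in delta_i if q1 == q and b == a}


def return_step(summaries, b, pending):
    """New set of summaries after reading return symbol b, or None if the run dies.

    summaries: the current set of pairs, which must all have the form (q, q).
    pending:   triples (symbol, u, target) retrieved along the nesting edge.
    """
    if any(q != target for q, target in summaries):
        return None  # some copy of A did not reach its guessed state
    if any(sym != b for sym, _, _ in pending):
        return None  # the return symbol guessed at the call was wrong
    return {(u, target) for (_, u, target) in pending}
```

Keeping the target component fixed while the first component advances is what allows the join at the matching return to be checked locally.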

Lemma 5.4. For every formula $\varphi$ of NWTL$^+$, there is a nested word automaton that
accepts the language $W_\varphi$ and has size doubly-exponential in $|\varphi|$.

Consider a formula $\varphi$ of NWTL$^+$ + W. For every within-subformula $W\varrho$ of $\varphi$, let $\varrho'$
be obtained from $\varrho$ by substituting each top-level subformula $W\psi$ in $\varrho$ by the proposition
$p_\psi$. Each of these primed formulas is a formula of NWTL$^+$. Then, if we take the product
of the nested word automata accepting $W_{\varrho'}$ corresponding to all the within-subformulas $\varrho$,
together with the nested word automaton $A_{\varphi'}$, the resulting language captures the set of
models of $\varphi$. Intuitively, the automaton for $W_{\varrho'}$ is ensuring that the truth of the proposition
$p_\varrho$ reflects the truth of the subformula $W\varrho$. If $\varrho$ itself has a within-subformula $W\psi$, then
the automaton for $\varrho$ treats it as an atomic proposition $p_\psi$, and the automaton checking $p_\psi$,
running in parallel, makes sure that the truth of $p_\psi$ correctly reflects the truth of $W\psi$.
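The substitution of top-level within-subformulas by fresh propositions can be sketched on formula trees. This is an illustration only: formulas are nested tuples, the tag `"within"` stands for the operator W, and a proposition $p_\psi$ is encoded as `("ap", ("p", psi))`; all of these encodings are assumptions of the example.

```python
def strip_within(f):
    """Replace each top-level within-subformula W psi of f by the proposition p_psi."""
    op = f[0]
    if op == "ap":
        return f
    if op == "within":
        # do not descend: only the outermost occurrences are replaced
        return ("ap", ("p", f[1]))
    return (op,) + tuple(strip_within(g) for g in f[1:])


def within_subformulas(f):
    """All subformulas of the form W rho occurring in f, at any depth."""
    found = set()
    if f[0] == "within":
        found.add(f)
    if f[0] != "ap":
        for g in f[1:]:
            found |= within_subformulas(g)
    return found
```

For a formula `phi`, one automaton would then be built for `strip_within(phi)` and one for `strip_within(rho)` for every `("within", rho)` in `within_subformulas(phi)`, matching the product described above.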

For the lower bound, the decision problem for LTL games can be reduced to the
satisfiability problem for formulas with linear untils and within operators [18], and this shows that
for CaRet extended with the within operator, the satisfiability problem is 2EXPTIME-hard.
We thus obtain:

Theorem 5.5. For the logic NWTL$^+$ extended with the within operator W, the
satisfiability problem and the model checking problem with respect to Boolean programs are both
2EXPTIME-complete.

Remark: checking $w \models \varphi$ for finite nested words. For finite nested words, one evaluates the
complexity of checking whether the given word satisfies a formula, in terms of the length
$|w|$ of the word and the size of the formula. A straightforward recursion on subformulas
shows that for NWTL formulas the complexity of this check is $O(|w| \cdot |\varphi|)$, and for both
logics with within operators, CaRet + $\{\mathcal{C}, \mathcal{R}\}$ and LTL$^\mu$ + W, it is $O(|w|^2 \cdot |\varphi|)$.
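The recursion on subformulas can be sketched for a small fragment: atomic propositions, Boolean connectives, next, abstract next, and the linear until. This is only an illustration and does not cover the summary-path operators of NWTL; the tuple encoding of formulas and the `(kind, propositions)` encoding of positions are assumptions of the example. Each subformula is evaluated once at every position, which gives the $O(|w| \cdot |\varphi|)$ bound for this fragment.

```python
def matching(word):
    """Map each matched call position to its return position (0-based)."""
    mu, stack = {}, []
    for i, (kind, _) in enumerate(word):
        if kind == "call":
            stack.append(i)
        elif kind == "ret" and stack:
            mu[stack.pop()] = i
    return mu


def evaluate(phi, word):
    """Truth value of phi at every position of a finite nested word.

    word is a list of (kind, props) with kind in {"call", "int", "ret"} and
    props a set of proposition names; "call", "int", "ret" also act as
    propositions describing the kind of the position.
    """
    n, mu, memo = len(word), matching(word), {}

    def val(f):
        if f in memo:
            return memo[f]
        op = f[0]
        if op == "ap":
            res = [f[1] in props or f[1] == kind for kind, props in word]
        elif op == "not":
            res = [not v for v in val(f[1])]
        elif op == "or":
            res = [x or y for x, y in zip(val(f[1]), val(f[2]))]
        elif op == "next":
            s = val(f[1])
            res = [i + 1 < n and s[i + 1] for i in range(n)]
        elif op == "mnext":  # abstract next: jump from a call to its return
            s = val(f[1])
            res = [i in mu and s[mu[i]] for i in range(n)]
        elif op == "until":  # linear until, filled in from right to left
            left, right = val(f[1]), val(f[2])
            res = [False] * n
            for i in range(n - 1, -1, -1):
                res[i] = right[i] or (left[i] and i + 1 < n and res[i + 1])
        else:
            raise ValueError(f"unsupported operator: {op}")
        memo[f] = res
        return res

    return val(phi)
```

Memoizing by subformula is the design choice that keeps the cost linear in the formula size; a within operator would instead require re-evaluating its argument on many subwords, consistent with the extra factor of $|w|$ stated above.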

5.4. On within and succinctness. We saw that adding within operators to NWTL$^+$
increases the complexity of model-checking by one exponent. Thus there is no
polynomial-time translation from NWTL$^+$ + W to NWTL$^+$. We now prove a stronger result that gives
a space bound as well: while NWTL$^+$ + W has the same power as NWTL$^+$, its formulae
can be exponentially more succinct than formulas of NWTL$^+$. That is, there is a sequence
$\varphi_n$, $n \in \mathbb{N}$, of NWTL$^+$ + W formulas such that $\varphi_n$ is of size $O(n)$, and the smallest formula
of NWTL$^+$ equivalent to $\varphi_n$ is of size $2^{\Omega(n)}$. For this result, we require nested $\omega$-words to
be over the alphabet $2^{AP}$.

Theorem 5.6. NWTL$^+$ + W is exponentially more succinct than NWTL$^+$.
Proof. The proof is based upon succinctness results in [9, 23], by adapting their examples
to nested words.

From the FO completeness of NWTL$^+$, we have that NWTL$^+$ + W can be translated into
NWTL$^+$. We show that at least an exponential blow-up is necessary for such a translation.
More precisely, we construct a sequence $\{\varphi_n\}_{n \ge 1}$ of NWTL$^+$ + W formulas of size $O(n)$,
such that the shortest NWTL$^+$ formula that is equivalent to $\varphi_n$ is of size $2^{\Omega(n)}$. Our proof
is a modification of similar proofs given in [9, 23]. Assume $\Sigma = \{a_0, \ldots, a_n\}$, and let $\varphi_n$ be
the following NWTL$^+$ + W formula (here, $\Box^a\theta$ and $\Diamond^a\theta$ are abbreviations for $\neg(\top\, U^a\, \neg\theta)$
and $\top\, U^a\, \theta$, respectively):

□'^('call^ >Vn"('(/\(ai ^ e"(a, A-0T))) ^ (oq ^ ^"(ao A -0T)) 
^ ^ i=i 

It is not hard to see that $w \models \varphi_n$ iff for all positions $i, j$ in $w$ such that $\mu(i, j)$ holds, if
position $\ell$ in $w[i, j]$ coincides with $i$ on $a_1, \ldots, a_n$, then $\ell$ also coincides with $i$ on $a_0$.
It is shown in Theorem 15.11 that for each NWTL"*" formula a, the language 

La = {w \ w is a nested L<j-word such that w \= a} 

is recognized by a nondeterministic nested word automaton of size prove 
the theorem, it is enough to show that every such automaton for is of size 2^"*"' . Let 
A be a nondeterministic nested word automaton for L^^^. Assume that bo, ... , 62"-i is an 
enumeration of the symbols in 2^^'^^"''^. For every K C {0, . . . , 2" — 1} let wk be the word 
Co • • • C2"-i over alphabet 2^, where for each z < 2" — 1: 




bi i e K 

bi U {oq} otherwise 



It is not hard to see that for each K C {0, . . . , 2" — 1}, the nested w-word (u)^, /i), where 
^i = {(i,3 • 2" - j + 1) I 1 < j < 2'^}, is such that {w'f^,^l) h ^n- Let {q}i,p]^,ql) and 
{Qk'^Pk'i Qk') be pairs of states such that (1) there exists an accepting run of A on {wf^, fj.) 
such that A is in state q]^ and has hierarchical symbol p]^ in call position 2", and A is 
in state qj^ in internal position 2-2"; (2) there exists an accepting run of A on {w^,,^) 
such that A is in state q]^, and has hierarchical symbol p\^, in call position 2", and A is 
in state q\^, in internal position 2 • 2". Next we show that {q\,PK,q\) / {qK',PK',qK') 
if K ^ K' . On the contrary, assume that {q\,PK,q\) = {q\c'iPK'^^\:')- Then A accepts 
{wkwk'W^, fi), which leads to a contradiction since {wkwk'w'^, fi) ^ (pn- Given that the 
number of different $K$'s is $2^{2^n}$, the latter implies that the number of different triples of states and hierarchical symbols of $A$ is at least $2^{2^n}$. Thus, if $m$ is equal to the number of states of $A$ plus the number of hierarchical symbols of $A$, then $m^3 \geq 2^{2^n}$ and, hence, $m \geq 2^{2^n/3}$. Therefore, the size of $A$ is $2^{2^{\Omega(n)}}$. This concludes the proof of the theorem. □




6. Finite-Variable Fragments

We have already seen that FO formulas in one free variable over nested words can be 
written using just three distinct variables, as in the case of the usual, unnested, words. For 
finite nested words this is a consequence of a tree representation of nested words and the 
three-variable property for FO over finite trees [20], and for infinite nested words this is a consequence of Theorem 4.1.

In this section we prove two results. First, we give a model-theoretic proof that FO formulas with zero, one, or two free variables over nested words (finite or infinite) are equivalent to FO$^3$ formulas. Given the FO = FO$^3$ collapse, we ask whether there is a temporal logic expressively complete for FO$^2$, the two-variable fragment. We adapt techniques from [9] to find a temporal logic that has the same expressiveness as FO$^2$ over nested words (in a vocabulary that has successor relations corresponding to the "next" temporal operators).

6.1. The three-variable property. We give a model-theoretic, rather than a syntactic, argument, that uses Ehrenfeucht-Fraïssé games and shows that over nested words, formulas with at most two free variables are equivalent to FO$^3$ formulas. Note that for finite nested words, the translation into trees, already used in the proof of Theorem 4.1, can be done using at most three variables. This means that the result of [20] establishing the 3-variable property for finite ordered unranked trees gives us the 3-variable property for finite nested words. We prove that FO = FO$^3$ over arbitrary nested words.

Theorem 6.1. Over finite or infinite nested words, every FO formula with at most 2 free variables is equivalent to an FO$^3$ formula.

Proof. As we mentioned already, in the finite case this is a direct consequence of [20], so we concentrate on the infinite case. It is more convenient for us to prove the result for ordered unranked forests in which a subtree rooted at every node is finite. The way to translate a nested $\omega$-word into such a forest is as follows: when a matched call $i$ with $\mu(i,j)$ is encountered, it defines a subtree with $i$ as its root, and $j + 1$ as the next sibling (note that this is different from the translation into binary trees we used before). If $i$ is an internal position, or a pending call or a pending return position, then it has no descendants and its next sibling is $i + 1$. Matched returns do not have a next sibling, nor do they have any descendants. The nodes in the forest are labeled with call, ret, and the propositions in $\Sigma$, as in the original nested word.

It is routine to define, in FO, relations $\prec_{desc}$ and $\prec_{sib}$ for descendant and younger sibling in such a forest. Furthermore, from these relations, we can define the usual $<$ and $\mu$ in nested words using at most 3 variables as follows. For $x < y$, the definition is given by

$$(y \prec_{desc} x) \vee \exists z\,\big(x \preceq_{desc} z \wedge \exists x\,(z \prec_{sib} x \wedge y \preceq_{desc} x)\big)$$

and for $\mu(x, y)$, by

$$(y \prec_{desc} x) \wedge \forall z\,\big((z \prec_{desc} x) \to \exists x\,(x = z \wedge x \leq y)\big).$$
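
The forest translation and the two definitions can be checked by brute force on a finite nested word. The sketch below is illustrative only: it works on finite prefixes with positions $1..n$, takes the matching relation as a dictionary `mu` (call position to return position), and reads the definition of $<$ with reflexive descendant relations where an ancestor-or-self is needed. All function names are hypothetical.

```python
def build_forest(n, mu):
    """Translate a finite nested word (positions 1..n) into parent / next-sibling maps."""
    returns = set(mu.values())
    parent, next_sib = {}, {}
    stack = []  # matched calls whose return has not been read yet
    for i in range(1, n + 1):
        parent[i] = stack[-1] if stack else None
        if i in mu:                 # matched call: next sibling follows its return
            nxt = mu[i] + 1
            stack.append(i)
        elif i in returns:          # matched return: last child, no next sibling
            nxt = None
            stack.pop()
        else:                       # internal, pending call, or pending return
            nxt = i + 1
        next_sib[i] = nxt if nxt is not None and nxt <= n else None
    return parent, next_sib


def is_desc(parent, y, x):
    """True iff y is a proper descendant of x."""
    p = parent[y]
    while p is not None:
        if p == x:
            return True
        p = parent[p]
    return False


def is_later_sib(next_sib, x, y):
    """True iff y is a later sibling of x."""
    s = next_sib[x]
    while s is not None:
        if s == y:
            return True
        s = next_sib[s]
    return False


def defined_less(parent, next_sib, nodes, x, y):
    """x < y, defined from the descendant and sibling relations only."""
    if is_desc(parent, y, x):
        return True
    for z in nodes:
        if z == x or is_desc(parent, x, z):          # z is x or an ancestor of x
            for x2 in nodes:
                if is_later_sib(next_sib, z, x2) and (y == x2 or is_desc(parent, y, x2)):
                    return True
    return False


def defined_mu(parent, next_sib, nodes, x, y):
    """mu(x, y): y is below x and every descendant of x is at most y."""
    return is_desc(parent, y, x) and all(
        z == y or defined_less(parent, next_sib, nodes, z, y)
        for z in nodes if is_desc(parent, z, x)
    )
```

On the word with matched pairs $(1,5)$, $(2,3)$, $(6,7)$ and length 8, the defined relations coincide with the numeric order and with `mu`.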

Thus, it suffices to prove the three-variable property for such ordered forests, which will be referred to as $A$, $B$, etc. We shall use pebble games. Let $G^v_m(A, a_1, a_2; B, b_1, b_2)$ be the $m$-move, $v$-pebble game on structures $A$ and $B$ where initially pebbles $x_i$ are placed on $a_i$ in $A$ and $b_i$ in $B$. Player II has a winning strategy for $G^v_m(A, a_1, a_2; B, b_1, b_2)$ iff $A, a_1, a_2$ and $B, b_1, b_2$ agree on all formulas with at most $v$ variables and quantifier-depth $m$. We know from [13] that to prove Theorem 6.1, it suffices to show the following.

Claim 6.2. For all $k$, if Player II has a winning strategy for the game $G^3_{3k+2}(A, a_1, a_2; B, b_1, b_2)$, then she also has a winning strategy for the game $G^k_k(A, a_1, a_2; B, b_1, b_2)$.

We will show how Player II can win the $k$-pebble game by maintaining a set of 3-pebble sub-games on which she will copy Player I's moves and decide on good responses using her winning strategy for these smaller 3-pebble games. The choice of these sub-games will partition the universe $|A| \cup |B|$ so that each play by Player I in the $k$-pebble game will be answered in one 3-pebble game. This is similar to the proof that linear orderings have the 3-variable property [13].

The subgames $G^3_m(A, a_1, a_2; B, b_1, b_2)$ that Player II maintains will all be vertical, in which $a_2 \prec_{desc} a_1$ and $b_2 \prec_{desc} b_1$ hold, or horizontal, in which $a_1 \prec_{sib} a_2$ and $b_1 \prec_{sib} b_2$ hold.

The following lemma gives the beginning strategy of Player II in which she replaces 
an arbitrary game configuration with a set of configurations each of which is vertical or 
horizontal. 

Lemma 6.3. If Player II wins $G^3_{m+4}(A, a_1, a_2; B, b_1, b_2)$, then there are points $a'_1, a'_2$ from $A$ and $b'_1, b'_2$ from $B$ such that Player II wins the horizontal game $G^3_{m+2}(A, a'_1, a'_2; B, b'_1, b'_2)$ and the vertical games $G^3_{m+2}(A, a'_i, a_i; B, b'_i, b_i)$ for $i = 1, 2$.

Proof. For this proof, since $A$ and $B$ are fixed, we will describe a game only by listing the chosen points, e.g., $(a_1, a_2; b_1, b_2)$. We simulate two moves of the game $G^3_{m+4}(a_1, a_2; b_1, b_2)$, in which we choose Player I's moves and then Player II answers according to her winning strategy. Let $u + v$ denote the least common ancestor of $u$ and $v$. First, we have Player I place pebble $x_3$ on $a'_1$, the unique child of $a_1 + a_2$ that is an ancestor of $a_1$. (Note that if $a'_1 = a_1$ then this move can be skipped, and similarly for the second move if $a'_2 = a_2$.) Player II answers by placing $x_3$ on some point $b'_1$. Second, Player I should move pebble $x_1$ from $a_1$ to $a'_2$, the unique child of $a_1 + a_2$ that is an ancestor of $a_2$. Player II moves $x_1$ to some point $b'_2$.

Since Player II has moved according to her winning strategy, we have that she still has a winning strategy for the three games in the statement of the lemma. Furthermore, since $a'_1$ and $a'_2$ are siblings and we have two remaining moves, $b'_1$ and $b'_2$ must be siblings as well. □

Using Lemma 6.3, we initially partition the universe according to four subgames:

• $(a_r, a_p; b_r, b_p)$ with domain everything not below $a_p$ or $b_p$. Here $a_p = a_1 + a_2$, i.e., the parent of $a'_1$, $b_p = b_1 + b_2$, i.e., the parent of $b'_1$, and $a_r$ and $b_r$ are the roots of $A$ and $B$ (the roots are not necessary but then the subgames are all on horizontal or vertical pairs),

• $(a'_1, a_1; b'_1, b_1)$ with domain everything below $a'_1$ or $b'_1$,

• $(a'_2, a_2; b'_2, b_2)$, with domain everything below $a'_2$ or $b'_2$,

• $(a'_1, a'_2; b'_1, b'_2)$, with the remaining domain.

We now have to explain, inductively, how all moves of Player I in the $k$-pebble game are answered by Player II and how, in the process, the universe is further partitioned. We inductively assume that Player II has a winning strategy for each of the 3-pebble, $m$-move sub-games. There are two cases:

Vertical: Player I places a new pebble on a point $a$ that is in the domain of a vertical game: $(a_1, a_2; b_1, b_2)$. We thus know that $a_1$ is a proper ancestor of $a$. The interesting case is where neither of $a$ and $a_2$ is above the other so, without loss of generality, assume that $a < a_2$. We place $x_3$ on $a'_2$, the child of $a + a_2$ that is above $a_2$. Let Player II move according to her winning strategy, placing $x_3$ on some point $b'_2$. We split the original game into $(a_1, a'_2; b_1, b'_2)$ and $(a'_2, a_2; b'_2, b_2)$, so Player II has a winning strategy for these 3-pebble, $m - 1$ move sub-games. Next, in the $(a_1, a'_2; b_1, b'_2)$ game we place $x_3$ on $a_p$, the parent of $a'_2$, and we let Player II answer according to her winning strategy, placing $x_3$ on some point $b_p$. We then split off the game $(a_1, a_p; b_1, b_p)$.

Returning to the game $(a_1, a'_2; b_1, b'_2)$, we have Player I place $x_3$ on $a'$, the sibling of $a'_2$ above $a$, and let Player II answer according to her winning strategy, placing $x_3$ on some point $b'$.

Finally, we let Player I move $x_1$ to $a$, and let Player II reply with $x_1$ on some point $b$.

The sub-games are thus: $(a_1, a_p; b_1, b_p)$, $(a', a'_2; b', b'_2)$, $(a', a; b', b)$, and $(a'_2, a_2; b'_2, b_2)$, and Player II has winning strategies for the $G^3_{m-3}$ game on all of them.

Horizontal: In this case, we have the configuration $(a_1, a_2; b_1, b_2)$, consisting of a pair of siblings. The only interesting case occurs when Player I puts a new pebble on some vertex $a$ such that $a_1 < a < a_2$. In this case, we have Player I place pebble $x_3$ on $a'$, the sibling of $a_1$ above $a$. Player II will place pebble $x_3$ on some vertex $b'$, which must be a sibling of $b_1$ and $b_2$.

Next, in the game below $a'$ and $b'$, we let Player I place pebble $x_2$ on $a$ and we let Player II answer according to her winning strategy in this game, placing $x_2$ on some vertex $b$. The domain of the original configuration is thus split into domains for three sub-games: $(a_1, a'; b_1, b')$, $(a', a_2; b', b_2)$, and $(a', a; b', b)$. On each of these, Player II has a winning strategy for the 3-pebble, $m - 2$ move game.

We now complete the proof that Player II wins $G^k_k(a_1, a_2; b_1, b_2)$. Whenever Player I places a new pebble on some point, say $a$, in the original game, Player II will answer as described above, i.e., in one of the little games we will have that Player II wins $G^3_{3r}(a, a'; b, b')$, where there are $r$ moves remaining in the big game.

Player II then answers in the big game by placing the corresponding pebble on $b$. To see that the resulting moves are a win for Player II, we must just consider any two pebbled points, $a_i, a_j \in A$ and $b_i, b_j \in B$. If they came from the same sub-game, then they agree on the relations $\prec_{desc}, \prec_{sib}$ because Player II wins the sub-game. Otherwise, $a_i, b_i$ came from one sub-game, $G_i$, and $a_j, b_j$ came from another sub-game, $G_j$. By our choice of the domains and transitivity of $\prec_{desc}, \prec_{sib}$, it thus follows that $a_i, a_j$ stand in the same relation with respect to $\prec_{desc}, \prec_{sib}$ as $b_i, b_j$ do.

6.2. The two-variable fragment. In this section, we construct a temporal logic that captures the two-variable fragment of FO over nested words. Note that for finite unranked trees, a navigational logic capturing FO$^2$ is known [21, 20]: it corresponds to a fragment of XPath. However, translating the basic predicates over trees into the vocabulary of nested words requires 3 variables, and thus we cannot apply existing results even in the finite case.

Our temporal logic will be based on several next and eventually operators. Since FO$^2$ over a linear ordering cannot define the successor relation but temporal logics have next operators, we explicitly introduce successors into the vocabulary of FO. These successor relations in effect partition the linear edges into three disjoint types: interior edges, call edges, and return edges, and the nesting edges (except those from a position to its linear successor) into two disjoint types: call-return summaries, and call-interior-return summaries.

• $S^{i}(i,j)$ holds iff $j = i + 1$ and either $\mu(i,j)$ or $i$ is not a call and $j$ is not a return.

• $S^{c}(i,j)$ holds iff $i$ is a call and $j = i + 1$ is not a return.

• $S^{r}(i,j)$ holds iff $i$ is not a call and $j = i + 1$ is a return.

• $S^{cr}(i,j)$ holds iff $\mu(i,j)$ and there is a path from $i$ to $j$ using only call and return edges.

• $S^{cir}(i,j)$ holds iff $\mu(i,j)$ and neither $j = i + 1$ nor $S^{cr}(i,j)$.

Let $T$ denote the set $\{c, i, r, cr, cir\}$ of all edge types. In addition to the built-in predicates $S^t$ for $t \in T$, we add the transitive closure of all unions of subsets of these relations. That is, for each non-empty set $\Gamma \subseteq T$ of edge types, let $S^\Gamma$ stand for the union $\bigcup_{t \in \Gamma} S^t$, and let $\leq^\Gamma$ be the reflexive-transitive closure of $S^\Gamma$. Now when we refer to FO$^2$ over nested words, we mean FO$^2$ in the vocabulary of the unary predicates plus all the $\leq^\Gamma$'s, the five successor relations, and the built-in unary call and ret predicates.
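
The five successor relations can be computed directly for a finite nested word. The following is a minimal sketch under stated assumptions: positions are 0-indexed, each position is tagged `"call"`, `"int"`, or `"ret"`, the matching relation is recovered with a stack, and a "path using only call and return edges" is read as: every linear edge between $i$ and $j$ has type $c$ or $r$. The function names are hypothetical.

```python
def match(kinds):
    """Compute mu as a dict {call position: matching return position}."""
    mu, stack = {}, []
    for pos, kind in enumerate(kinds):
        if kind == "call":
            stack.append(pos)
        elif kind == "ret" and stack:
            mu[stack.pop()] = pos
    return mu


def successor_relations(kinds):
    """Return {edge type: set of (i, j)} for the types i, c, r, cr, cir."""
    mu = match(kinds)
    S = {t: set() for t in ("i", "c", "r", "cr", "cir")}

    def linear_type(p):
        a, b = kinds[p], kinds[p + 1]
        if mu.get(p) == p + 1 or (a != "call" and b != "ret"):
            return "i"                      # interior edge
        return "c" if a == "call" else "r"  # call edge or return edge

    for p in range(len(kinds) - 1):
        S[linear_type(p)].add((p, p + 1))

    for i, j in mu.items():
        if j != i + 1:                      # nesting edge that is not a linear edge
            inner = {linear_type(p) for p in range(i, j)}
            S["cr" if inner <= {"c", "r"} else "cir"].add((i, j))
    return S
```

For the word call, call, int, ret, ret this yields two call edges, two return edges, and two call-return summaries; for call, int, int, ret the nesting edge is a call-interior-return summary.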

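We define a temporal logic unary-NWTL that has future and past versions of next operators parameterized by edge types, and eventually operators parameterized by a set of edge types. For example, $\Diamond^{\{c\}}$ means eventually along a path containing only call edges. Its formulas are given by:

$$\varphi := \top \mid a \mid \mathrm{call} \mid \mathrm{ret} \mid \neg\varphi \mid \varphi \vee \varphi' \mid \bigcirc^{t}\varphi \mid \ominus^{t}\varphi \mid \Diamond^{\Gamma}\varphi \mid \Diamond_{-}^{\Gamma}\varphi$$

where $a$ ranges over $\Sigma$, $t$ ranges over $T$, and $\Gamma$ ranges over non-empty subsets of $T$, with $\ominus^{t}$ and $\Diamond_{-}^{\Gamma}$ denoting the past versions of $\bigcirc^{t}$ and $\Diamond^{\Gamma}$. The semantics is defined in the obvious way. For example, $(w,i) \models \Diamond^{\Gamma}\varphi$ iff for some position $j$, $i \leq^{\Gamma} j$ and $(w,j) \models \varphi$; $(w,i) \models \bigcirc^{t}\varphi$ iff for some position $j$, $S^{t}(i,j)$ and $(w,j) \models \varphi$; and $(w,i) \models \mathrm{call}$ iff call($i$) holds in $w$.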
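
The semantics can be illustrated by a small evaluator over finite nested words. This is a sketch, not the paper's model-checking procedure: formulas are encoded as nested tuples (an encoding chosen here for illustration), the successor relations are passed in as a dictionary from edge type to a set of position pairs, and the past operators are read as the mirror images of the future ones, which is an assumption.

```python
def reachable(S, gamma, start, backward=False):
    """Positions j with start <=^Gamma j (or j <=^Gamma start when backward)."""
    edges = set().union(*(S[t] for t in gamma))
    seen, frontier = {start}, [start]
    while frontier:
        u = frontier.pop()
        for a, b in edges:
            src, dst = (b, a) if backward else (a, b)
            if src == u and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


def holds(f, i, kinds, labels, S):
    """Decide whether formula f holds at position i."""
    op = f[0]
    if op == "top":
        return True
    if op == "prop":
        return f[1] in labels[i]
    if op in ("call", "ret"):
        return kinds[i] == op
    if op == "not":
        return not holds(f[1], i, kinds, labels, S)
    if op == "or":
        return holds(f[1], i, kinds, labels, S) or holds(f[2], i, kinds, labels, S)
    if op == "next":      # next along an edge of type f[1]
        return any(holds(f[2], j, kinds, labels, S) for a, j in S[f[1]] if a == i)
    if op == "prev":      # past version of next
        return any(holds(f[2], a, kinds, labels, S) for a, j in S[f[1]] if j == i)
    if op in ("ev", "past_ev"):   # eventually along edges whose types are in f[1]
        targets = reachable(S, f[1], i, backward=(op == "past_ev"))
        return any(holds(f[2], j, kinds, labels, S) for j in targets)
    raise ValueError(f"unknown operator: {op}")
```

For instance, `("ev", ("c",), ("prop", "b"))` encodes $\Diamond^{\{c\}} b$, and `("next", "cr", ("prop", "a"))` encodes $\bigcirc^{cr} a$.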

For an FO$^2$ formula $\varphi(x)$ with one free variable $x$, let $\mathrm{qdp}(\varphi)$ be its quantifier depth, and for a unary-NWTL formula $\varphi'$, let $\mathrm{odp}(\varphi')$ be its operator depth.

Theorem 6.4.

(1) unary-NWTL is expressively complete for FO$^2$ over nested words.

(2) If formulas are viewed as DAGs (i.e., identical subformulas are shared), then every FO$^2$ formula $\varphi(x)$ can be converted to an equivalent unary-NWTL formula $\varphi'$ of size $2^{O(|\varphi|(\mathrm{qdp}(\varphi)+1))}$ and $\mathrm{odp}(\varphi') \leq 10\,\mathrm{qdp}(\varphi)$. The translation is computable in time polynomial in the size of $\varphi'$.

(3) Model checking of unary-NWTL can be carried out with the same worst case complexity as for NWTL.

Proof. The translation from unary-NWTL into FO$^2$ is standard and can be done with negligible blow-up in the size of the formula, so we concentrate on the other direction. The proof generalizes the proof of an analogous result for unary temporal logic over words from [9].

Given an FO$^2$ formula $\varphi(x)$, the translation procedure works as follows. When $\varphi(x)$ is atomic, i.e., of the form $a(x)$, it outputs $a$. When $\varphi(x)$ is of the form $\psi_1 \vee \psi_2$ or $\neg\psi$ (we say that $\varphi(x)$ is composite), it recursively computes $\psi'_1$ and $\psi'_2$, or $\psi'$, and outputs $\psi'_1 \vee \psi'_2$ or $\neg\psi'$. The two cases that remain are when $\varphi(x)$ is of the form $\exists x\,\varphi^*(x)$ or $\exists y\,\varphi^*(x,y)$. In both cases, we say that $\varphi(x)$ is existential. In the first case, $\varphi(x)$ is equivalent to $\exists y\,\varphi^*(y)$ and, viewing $x$ as a dummy free variable in $\varphi^*(y)$, this reduces to the second case.

In the second case, we can rewrite $\varphi^*(x,y)$ in the form

$$\varphi^*(x,y) = \beta\big(\chi_0(x,y), \ldots, \chi_{r-1}(x,y), \xi_0(x), \ldots, \xi_{s-1}(x), \zeta_0(y), \ldots, \zeta_{t-1}(y)\big)$$

where $\beta$ is a propositional formula, each formula $\chi_i$ is an atomic order formula, each formula $\xi_i$ is an atomic or existential FO$^2$ formula with $\mathrm{qdp}(\xi_i) < \mathrm{qdp}(\varphi)$, and each formula $\zeta_i$ is an atomic or existential FO$^2$ formula with $\mathrm{qdp}(\zeta_i) < \mathrm{qdp}(\varphi)$.

In order to be able to recurse on subformulas of $\varphi$ we have to separate the $\xi_i$'s from the $\zeta_i$'s. We first introduce a case distinction on which of the subformulas $\xi_i$ hold or not. We obtain the following equivalent formulation for $\varphi$:

$$\bigvee_{\bar\gamma \in \{\top,\bot\}^s} \Big( \bigwedge_{i<s} (\xi_i \leftrightarrow \gamma_i) \wedge \exists y\, \beta(\chi_0, \ldots, \chi_{r-1}, \gamma_0, \ldots, \gamma_{s-1}, \zeta_0, \ldots, \zeta_{t-1}) \Big)$$

We proceed by a case distinction on which order relation holds between $x$ and $y$, where $x \leq y$. We consider mutually exclusive cases, determined by the following formulas, which we call order types.

• $\Psi_0$ is $x = y$.

• For each $t \in T$, $\Psi_t$ is $S^t(x, y)$.

• For each $t \in T$, $\Psi_{t,t}$ is $\exists z\,(S^t(x, z) \wedge z <^{t} y)$.

• Let $o = t_1, t_2, \ldots, t_k$ be a sequence over $T$ such that $2 \leq k \leq 5$, all $t_j$'s are distinct, and a call never appears before return (that is, if $t_i = c$ then $t_j \neq r$ for $j > i$). Then $\Psi_o$ stands for

$$\exists z_1, z'_1, z_2, z'_2, \ldots, z_k\, \big(S^{t_1}(x, z_1) \wedge z_1 \leq^{\Gamma_1} z'_1 \wedge S^{t_2}(z'_1, z_2) \wedge z_2 \leq^{\Gamma_2} z'_2 \wedge \cdots \wedge z_k \leq^{\Gamma_k} y\big)$$

where for $1 \leq i \leq k$, the set $\Gamma_i$ equals the set $\{t_1, t_2, \ldots, t_i\}$, but with $r$ removed if both $c$ and $r$ belong to this set.

We claim that these order types are mutually exclusive and complete, and are expressible in unary-NWTL (and hence, in FO$^2$). First, let us show that the order types form a disjoint partition, meaning for all pairs $(x, y)$ such that $x \leq y$, we have exactly one of these relationships holding true. To see this, suppose $x < y$. Then either $S^t(x,y)$ holds for some type $t$ (and the successor relations are disjoint, for distinct $t$'s), or there is a path from $x$ to $y$ that uses at least two edges. The key observation is that a path from $x$ to $y$ is a summary path iff the path does not contain a call edge followed later by a return edge. Also, there is a unique summary path from $x$ to $y$. We can now classify the paths by the edge types that this unique summary path contains, and the order in which they first appear in the path. For example, $\Psi_{c,c}(x, y)$ holds when there is a path from $x$ to $y$ using 2 or more call edges; $\Psi_{c,cir}(x, y)$ holds when there is a path from $x$ to $y$ which begins with a call edge, uses at least one call-interior-return summary edge, and uses only these two types of edges; $\Psi_{r,i,c}(x, y)$ holds when there is a path from $x$ to $y$ that can be split into three consecutive parts: a part containing only return edges, a part containing at least one internal and only internal and return edges, and a part containing at least one call and only call and internal edges. Note that some of these order types are empty: for example, two summary edges can never follow one another, and hence $\Psi_{cr,cr}(x, y)$ can never hold. Emptiness of some of the order types is not relevant to the proof.
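
The classification by the unique summary path can be sketched as follows for finite nested words. The sketch assumes the usual definition of the summary path from $x$ to $y$ (at a matched call whose return is at most $y$, jump to the return; otherwise step to the next position), which is recalled here from the general theory rather than from this section. Positions are 0-indexed and all function names are hypothetical.

```python
def match(kinds):
    """Compute mu as a dict {call position: matching return position}."""
    mu, stack = {}, []
    for pos, kind in enumerate(kinds):
        if kind == "call":
            stack.append(pos)
        elif kind == "ret" and stack:
            mu[stack.pop()] = pos
    return mu


def linear_type(kinds, mu, p):
    """Type (i, c, or r) of the linear edge from p to p + 1."""
    a, b = kinds[p], kinds[p + 1]
    if mu.get(p) == p + 1 or (a != "call" and b != "ret"):
        return "i"
    return "c" if a == "call" else "r"


def edge_type(kinds, mu, p, q):
    """Type of a summary-path edge: linear, or a cr / cir summary."""
    if q == p + 1:
        return linear_type(kinds, mu, p)
    inner = {linear_type(kinds, mu, m) for m in range(p, q)}
    return "cr" if inner <= {"c", "r"} else "cir"


def summary_path(kinds, x, y):
    """Unique summary path from x to y (assumed definition, see lead-in)."""
    mu = match(kinds)
    path = [x]
    while path[-1] < y:
        p = path[-1]
        r = mu.get(p)
        path.append(r if r is not None and r <= y else p + 1)
    return path


def order_type(kinds, x, y):
    """Edge types along the summary path, and their order of first appearance."""
    mu = match(kinds)
    path = summary_path(kinds, x, y)
    types = [edge_type(kinds, mu, p, q) for p, q in zip(path, path[1:])]
    return types, list(dict.fromkeys(types))
```

On the word int, ret, int, call, int, the path from position 0 to position 4 has edge types r, i, i, c, so the pair falls under the order type indexed by the sequence r, i, c.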

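When we assume that one of these order types is true, each atomic order formula evaluates to either $\top$ or $\bot$; in particular, each of the $\chi_i$'s evaluates to either $\top$ or $\bot$; we will denote this truth value by $\chi_i^\tau$. For example, when $\Psi_{cr}(x,y)$ holds then (1) $S^t(x,y)$ is true for $t = cr$ and false for $t \neq cr$, and (2) $x \leq^\Gamma y$ is true if $\Gamma$ contains $cr$ or if $\Gamma$ contains both $c$ and $r$, and false otherwise.

We can finally rewrite $\varphi$ as follows, where $\Upsilon$ stands for the set of all order types:

$$\bigvee_{\bar\gamma \in \{\top,\bot\}^s} \Big( \bigwedge_{i<s} (\xi_i \leftrightarrow \gamma_i) \wedge \bigvee_{\tau \in \Upsilon} \exists y\, \big(\tau \wedge \beta(\chi_0^\tau, \ldots, \chi_{r-1}^\tau, \bar\gamma, \zeta_0, \ldots, \zeta_{t-1})\big) \Big)$$

If $\tau$ is an order type, $\psi(x)$ an FO$^2$ formula, and $\psi'$ an equivalent unary-NWTL formula, there is a way to obtain a unary-NWTL formula $\tau\langle\psi'\rangle$ equivalent to $\exists y\,(\tau \wedge \psi(y))$, as follows. Assume that $x \leq y$.

• For the order type $\Psi_0$, $\tau\langle\psi'\rangle$ is $\psi'$ itself.

• For each $t \in T$, for the order type $\Psi_t$, $\tau\langle\psi'\rangle$ is $\bigcirc^{t}\psi'$.

• For each $t \in T$, for the order type $\Psi_{t,t}$, $\tau\langle\psi'\rangle$ is $\bigcirc^{t}\bigcirc^{t}\Diamond^{\{t\}}\psi'$.

• For order type $\Psi_o$, where $o = t_1, t_2, \ldots, t_k$ is a sequence over $T$, $\tau\langle\psi'\rangle$ is $\bigcirc^{t_1}\Diamond^{\Gamma_1}\bigcirc^{t_2}\Diamond^{\Gamma_2} \cdots \bigcirc^{t_k}\Diamond^{\Gamma_k}\psi'$, where for $1 \leq i \leq k \leq 5$, the set $\Gamma_i$ equals the set $\{t_1, t_2, \ldots, t_i\}$, but with $r$ removed if both $c$ and $r$ belong to this set.

The case corresponding to past operators is analogous. Our procedure will therefore recursively compute $\xi'_i(x)$ for $i < s$ and $\zeta'_i(x)$ for $i < t$ and output

$$\bigvee_{\bar\gamma \in \{\top,\bot\}^s} \Big( \bigwedge_{i<s} (\xi'_i \leftrightarrow \gamma_i) \wedge \bigvee_{\tau \in \Upsilon} \tau\big\langle \beta(\chi_0^\tau, \ldots, \chi_{r-1}^\tau, \bar\gamma, \zeta'_0(x), \ldots, \zeta'_{t-1}(x)) \big\rangle \Big). \tag{6.1}$$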

Now we verify that $|\varphi'|$ and $\mathrm{odp}(\varphi')$ are bounded as stated in the theorem. Note that the size $|\varphi'|$ is measured by viewing the unary-NWTL formula as a DAG, i.e., sharing identical subformulas. That $\mathrm{odp}(\varphi') \leq 10\,\mathrm{qdp}(\varphi)$ is easily seen from the operator depth in the translation table above. The proof that $|\varphi'| \leq 2^{c|\varphi|(\mathrm{qdp}(\varphi)+1)}$ for some constant $c$ is inductive on the quantifier depth of $\varphi$. The base case is trivial, and the only interesting case in the inductive step is when $\varphi$ is of the form $\exists y\,\varphi^*(x,y)$ as above. In this case, we have to estimate the length of (6.1). There are $2^s \leq 2^{|\varphi|}$ possibilities for $\bar\gamma$ in (6.1), and each disjunct in (6.1) has length at most $d\,|\varphi| \max_{i<s,\,j<t}(|\xi'_i|, |\zeta'_j|)$ for some constant $d$. By induction hypothesis, the latter is bounded by $d\,|\varphi|\, 2^{c|\varphi|\,\mathrm{qdp}(\varphi)}$, which implies the claim, provided $c$ is chosen large enough.

It is straightforward to verify that our translation to $\varphi'$ can be computed in time polynomial in $|\varphi'|$.

Model checking of unary-NWTL can be achieved with the same complexity as for NWTL using a variant of the tableaux construction in Section 5. □

7. Conclusion 

We have provided several new temporal logics over nested words and shown that they are 
first-order expressively complete. We have furthermore shown that first-order logic over 
nested words has the three-variable property, and we have also provided a temporal logic 
over nested words that is complete for two-variable first-order logic. We have shown, via an 
automata-theoretic approach based on nested word automata, that satisfiability for the logic 
NWTL$^+$ is EXPTIME-complete, and that model checking runs in time polynomial in the
size of the RSM model and exponential in the size of the formula. When the within modality 
is added to NWTL, the complexity of model checking becomes doubly exponential. We note 
that it remains open whether the original temporal logic CaRet, proposed for nested words 
in [2], is first-order complete, but we conjecture that it is not. 
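
Appendix A. Proof of Lemma 4.3

For translating each NWTL$^{ss}$ formula $\varphi$ into an equivalent NWTL formula $\beta_\varphi$, we need to consider only the case of until/since operators. The formula $\psi\,\mathbf{U}^{ss}\,\theta$ is translated into

$$\beta_\theta \vee \Big(\beta_\psi \wedge \Big(\big[(\beta_\psi \vee \mathrm{ret}) \wedge (\neg\mathrm{mcall} \to \bigcirc\beta_\psi) \wedge (\mathrm{mcall} \to (\bigcirc_\mu\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\theta))\big]\ \mathbf{U}^\sigma\ \big[(\beta_\psi \vee \mathrm{ret}) \wedge (\neg\mathrm{mcall} \to \bigcirc\beta_\theta) \wedge (\mathrm{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta \vee \bigcirc(\neg\mathrm{ret} \wedge \gamma)))\big]\Big)\Big), \tag{A.1}$$

where $\gamma$ is a formula defined as follows:

$$\big[(\beta_\psi \vee \mathrm{ret}) \wedge (\neg\mathrm{mcall} \to \bigcirc\beta_\psi) \wedge (\mathrm{mcall} \to (\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\psi)) \wedge (\bigcirc\mathrm{ret} \to \mathrm{call})\big]\ \mathbf{U}^\sigma\ \big[(\beta_\psi \vee \mathrm{ret}) \wedge (\neg\mathrm{mcall} \to \bigcirc\beta_\theta) \wedge (\mathrm{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta))\big]$$

The proof that the translation is correct is by induction on the structure of NWTL$^{ss}$ formulas. Again we need to consider only the case of until/since operators. Assume that $\psi$, $\theta$ are equivalent to $\beta_\psi$ and $\beta_\theta$, respectively. We need to prove that $\psi\,\mathbf{U}^{ss}\,\theta$ is equivalent to (A.1).

($\Leftarrow$) We first show that if $(w,i)$ satisfies (A.1), then $(w,i) \models \psi\,\mathbf{U}^{ss}\,\theta$. Given that $(w,i)$ satisfies (A.1), either $(w,i) \models \beta_\theta$ or $(w,i)$ satisfies the second disjunct of (A.1). Since $\beta_\theta$ and $\theta$ are assumed to be equivalent, in the former case $(w,i) \models \psi\,\mathbf{U}^{ss}\,\theta$. Thus, assume that the latter case holds. Then $(w,i) \models \psi$, since $\psi$ and $\beta_\psi$ are equivalent, and there exists a summary path $i = i_0 < i_1 < \cdots < i_p$ such that:

$$(w,i_k) \models (\beta_\psi \vee \mathrm{ret}) \wedge (\neg\mathrm{mcall} \to \bigcirc\beta_\psi) \wedge (\mathrm{mcall} \to (\bigcirc_\mu\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\theta)), \quad 0 \leq k < p, \tag{A.2}$$

$$(w,i_p) \models (\beta_\psi \vee \mathrm{ret}) \wedge (\neg\mathrm{mcall} \to \bigcirc\beta_\theta) \wedge (\mathrm{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta \vee \bigcirc(\neg\mathrm{ret} \wedge \gamma))). \tag{A.3}$$

We consider three cases.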

(I) Assume that there exists a position $i_k$ ($k \in [0, p-1]$) such that $i_k$ is a matched call position and $(w,i_k) \models \bigcirc_\mu\bigcirc\beta_\theta$, and let $i_q$ ($q \in [0, p-1]$) be the first such position. Then only one semi-strict path with endpoints $i = i_0$ and $r(i_q) + 1$ can be obtained from the sequence $i_0 < i_1 < \cdots < i_q < r(i_q) + 1$ by removing all positions $i_k$ (with $k \in [1, q]$) such that $i_{k-1}$ is a matched call position and $r(i_{k-1}) = i_k$; let $i_0 = j_0 < j_1 < \cdots < j_\ell = r(i_q) + 1$ be that semi-strict path. Next we show that:

$$(w,j_k) \models \psi, \quad 0 \leq k < \ell,$$
$$(w,j_\ell) \models \theta,$$

from which we conclude that $(w,i) \models \psi\,\mathbf{U}^{ss}\,\theta$.

Given that $j_\ell = r(i_q) + 1$, $(w,i_q) \models \bigcirc_\mu\bigcirc\beta_\theta$ and we assume that $\theta$ and $\beta_\theta$ are equivalent, we have that $(w,j_\ell) \models \theta$. Next we show that $(w,j_k) \models \psi$ for every $k \in [0, \ell-1]$. If $k = 0$, then the property holds since $(w,i) \models \psi$ and we assume that $\psi$ and $\beta_\psi$ are equivalent. Assume that $k \in [1, \ell-1]$. If $j_k$ is not a return position, then $(w,j_k) \models \psi$ since $(w,j_k) \models \beta_\psi \vee \mathrm{ret}$ (recall that $j_k$ is a position in the summary path $i_0 < i_1 < \cdots < i_q$ since $k < \ell$). If $j_k$ is a return position, then we have to consider two cases. If $j_{k-1} = j_k - 1$, then we have that $j_k - 1$ is not a call position since $j_k$ is a return position, $j_0 < j_1 < \cdots < j_\ell$ is a semi-strict path and $k - 1 \in [0, \ell-2]$. Given that $j_{k-1}$ is a position in the summary path $i_0 < i_1 < \cdots < i_{q-1}$, we conclude that $(w, j_k - 1) \models \neg\mathrm{mcall} \to \bigcirc\beta_\psi$. Thus, from the fact that $j_{k-1}$ is not a call position, we conclude that $(w,j_k) \models \beta_\psi$. Hence, $(w,j_k) \models \psi$. Otherwise, $j_{k-1} \neq j_k - 1$, and we conclude that $j_{k-1}$ is a matched call position and $j_k = r(j_{k-1}) + 1$. Thus, since $i_q$ is the smallest one satisfying $\bigcirc_\mu\bigcirc\beta_\theta$ and $j_{k-1} < i_q$, and we know from (A.2) that $(w,j_{k-1}) \models \mathrm{mcall} \to (\bigcirc_\mu\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\theta)$, we see that $(w,j_{k-1}) \models \mathrm{mcall} \to \bigcirc_\mu\bigcirc\beta_\psi$ and, since $j_{k-1}$ is a matched call, we conclude that $(w,j_k) \models \beta_\psi$ and, therefore, $(w,j_k) \models \psi$.
(II) Assume that condition (I) does not hold, and also assume that either $i_p$ is not a matched call position or $i_p$ is a matched call position and $(w,i_p) \models \bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta$. Then given that $(w,i_p) \models \neg\mathrm{mcall} \to \bigcirc\beta_\theta$, we have that there exists a position $i_{p+1} > i_p$ such that $(w,i_{p+1}) \models \beta_\theta$ and $i_{p+1}$ is either $i_p + 1$ or $r(i_p) + 1$. Only one semi-strict path with endpoints $i = i_0$ and $i_{p+1}$ can be obtained from the sequence $i_0 < i_1 < \cdots < i_p < i_{p+1}$ by removing all positions $i_k$ (with $k \in [1, p]$) such that $i_{k-1}$ is a matched call position and $r(i_{k-1}) = i_k$; let $i_0 = j_0 < j_1 < \cdots < j_\ell = i_{p+1}$ be that semi-strict path. Next we show that:

$$(w,j_k) \models \psi, \quad 0 \leq k < \ell,$$
$$(w,j_\ell) \models \theta,$$

from which we conclude that $(w,i) \models \psi\,\mathbf{U}^{ss}\,\theta$.

Given that $(w,i_{p+1}) \models \beta_\theta$ and the hypothesis that $\theta$ and $\beta_\theta$ are equivalent, we have that $(w,j_\ell) \models \theta$. Next we show that $(w,j_k) \models \psi$ for every $k \in [0, \ell-1]$. If $k = 0$, then the property holds since $(w,i) \models \psi$. Assume that $k \in [1, \ell-1]$. If $j_k$ is not a return position, then $(w,j_k) \models \psi$ since $(w,j_k) \models \beta_\psi \vee \mathrm{ret}$ (recall that $j_k$ is a position in the summary path $i_0 < i_1 < \cdots < i_p$). If $j_k$ is a return position, then we have to consider two cases. If $j_{k-1} = j_k - 1$, then we have that $j_k - 1$ is not a call position since $j_0 < j_1 < \cdots < j_\ell$ is a semi-strict path and $k - 1 \in [0, \ell-2]$. Thus, given that $j_{k-1}$ is a position in the summary path $i_0 < i_1 < \cdots < i_{p-1}$, we have that $(w, j_k - 1) \models \neg\mathrm{mcall} \to \bigcirc\beta_\psi$, from which we conclude that $(w,j_k) \models \beta_\psi$. Hence, $(w,j_k) \models \psi$. Otherwise, $j_{k-1} \neq j_k - 1$, and we conclude that $j_{k-1}$ is a matched call position and $j_k = r(j_{k-1}) + 1$. Thus, given that condition (I) does not hold, we have that $(w,j_{k-1}) \models \mathrm{mcall} \to \bigcirc_\mu\bigcirc\beta_\psi$ (since $j_{k-1}$ is a position in the summary path $i_0 < i_1 < \cdots < i_{p-1}$ and $(w,j_{k-1}) \not\models \bigcirc_\mu\bigcirc\beta_\theta$). Thus, given that $j_{k-1}$ is a matched call, we conclude from (A.2) that $(w,j_k) \models \beta_\psi$ and, therefore, $(w,j_k) \models \psi$.

(III) We now look at the remaining cases, that is, condition (I) does not hold, $i_p$ is a matched call, and $(w,i_p) \models \neg(\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta)$. By (A.3), this implies $(w,i_p) \models \bigcirc(\neg\mathrm{ret} \wedge \gamma)$. From $(w,i_p) \models \bigcirc\gamma$, we see that there exists a summary path $i_p + 1 = i_{p+1} < i_{p+2} < \cdots < i_q$ such that:

$$(w,i_k) \models (\beta_\psi \vee \mathrm{ret}) \wedge (\neg\mathrm{mcall} \to \bigcirc\beta_\psi) \wedge (\mathrm{mcall} \to (\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\psi)) \wedge (\bigcirc\mathrm{ret} \to \mathrm{call}), \quad p + 1 \leq k < q,$$

$$(w,i_q) \models (\beta_\psi \vee \mathrm{ret}) \wedge (\neg\mathrm{mcall} \to \bigcirc\beta_\theta) \wedge (\mathrm{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta)).$$

We first show that $i_q < r(i_p)$. Assume to the contrary that $i_q \geq r(i_p)$. Since the first position on the path is inside the call $i_p$, there exists $k \in [p+1, q]$ such that $i_k = r(i_p)$. Given that $i_{p+1}$ is not a return position (since $(w,i_p) \models \bigcirc\neg\mathrm{ret}$), we have that $q > p + 1$ and, therefore, $i_k - 1$ is also a position in the summary path $i_{p+1} < i_{p+2} < \cdots < i_q$. But given that $i_k = r(i_p)$ is the matching return of $i_p$ and $i_p + 1 \leq i_k - 1$, we have that $i_k - 1$ is not a call position. Thus, $(w, i_k - 1) \not\models \bigcirc\mathrm{ret} \to \mathrm{call}$, which contradicts the fact that $i_{p+1} < i_{p+2} < \cdots < i_q$ witnesses formula $\gamma$. Therefore indeed $i_q < r(i_p)$.

Given that $(w,i_q) \models (\neg\mathrm{mcall} \to \bigcirc\beta_\theta) \wedge (\mathrm{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta))$, we conclude that there exists a position $i_{q+1} > i_q$ such that $(w,i_{q+1}) \models \beta_\theta$ and $i_{q+1}$ is either $i_q + 1$ or $r(i_q) + 1$. Only one semi-strict path with endpoints $i = i_0$ and $i_{q+1}$ can be obtained from the sequence $i_0 < i_1 < \cdots < i_q < i_{q+1}$ by removing all positions $i_k$ (with $k \in [1, q]$) such that $i_{k-1}$ is a matched call position and $r(i_{k-1}) = i_k$; let $i_0 = j_0 < j_1 < \cdots < j_\ell = i_{q+1}$ be that semi-strict path. Next we show that:

$$(w,j_k) \models \psi, \quad 0 \leq k < \ell,$$
$$(w,j_\ell) \models \theta,$$

from which we conclude that $(w,i) \models \psi\,\mathbf{U}^{ss}\,\theta$.

Given that $(w,i_{q+1}) \models \beta_\theta$, we conclude that $(w,j_\ell) \models \theta$. Next we show that $(w,j_k) \models \psi$ for every $k \in [0, \ell-1]$. If $k = 0$, then the property holds since $(w,i) \models \psi$ and we assume that $\psi$ and $\beta_\psi$ are equivalent. Assume that $k \in [1, \ell-1]$. If $j_k$ is not a return position, then $(w,j_k) \models \psi$ since $(w,j_k) \models \beta_\psi \vee \mathrm{ret}$ (recall that $j_k$ is a position in the sequence $i_0 < i_1 < \cdots < i_q$). If $j_k$ is a return position, then we need to consider two cases. If $j_{k-1} = j_k - 1$, then we have that $j_k - 1$ is not a call position since $j_0 < j_1 < \cdots < j_\ell$ is a semi-strict path and $k - 1 \in [0, \ell-2]$. Thus, given that $j_{k-1}$ is a position in the sequence $i_0 < i_1 < \cdots < i_{q-1}$ and $j_{k-1} \neq i_p$ (since $i_p$ is a call position), we have that $(w, j_k - 1) \models \neg\mathrm{mcall} \to \bigcirc\beta_\psi$, from which we conclude that $(w,j_k) \models \beta_\psi$. Hence, $(w,j_k) \models \psi$. If $j_{k-1} \neq j_k - 1$, then we have that $j_{k-1}$ is a matched call position and $j_k = r(j_{k-1}) + 1$. Moreover, in this case we also have that $j_k \leq i_p$. Indeed, to see this, assume to the contrary that $i_p < j_k$. Then given that $i_q < r(i_p)$, we know that $i_{q+1} \leq r(i_p)$. Thus, given that $i_p$ is a call position, $k < \ell$ and $j_\ell = i_{q+1}$, we conclude that $i_{p+1} \leq j_k \leq i_q$. Therefore, given that $j_k$ is a return position and $i_{p+1} < \cdots < i_q$ is a summary path, there exists $s \in [p+1, q]$ such that $i_s$ is a call position with matching return $j_k$. But since $j_{k-1}$ and $j_k$ are both positions in the summary path $i_{p+1} < \cdots < i_q$ and $j_k = r(j_{k-1}) + 1$, we conclude that this path contains three positions $a$, $b$ and $c$ such that $a < b < c$ and $c$ is the matching return of call position $a$, which contradicts the definition of summary path. So we proved $j_k \leq i_p$. Now we have that $(w,j_{k-1}) \models \mathrm{mcall} \to (\bigcirc_\mu\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\theta)$, from which we conclude that $(w,j_k) \models \beta_\psi$, since condition (I) does not hold and $j_k = r(j_{k-1}) + 1$. Hence, $(w,j_k) \models \psi$.

(=^) We now show that if (w,i) |= ipTJggO, then (w,i) satisfies (jA.ip . Given that {w,i) \= 
ijjTJggO, there exists a semi-strict path i = io < ii < • • • < iq such that: 

{id,ik) h V' 0<k<q, (A.4) 
{w,iq) h 6- (A.5) 

Notice that if (7 = 0, then (tD,i) ^ and, therefore, {w,i) satisfies the first disjunct of (jA.ip 
since 6 and are assumed to be equivalent. Thus, we suppose that q > 0, and we consider 
two cases. 

(I) Assume that there exists k G [0, q—1] such that ik is a matched call position, ik+i = 

and ik+i is not a return position, and let ip be the first such position. Then only one 
summary path with endpoints i = io and ip can be obtained from the semi-strict path 
io < ii < ■ ■ ■ < ip hy adding positions r{ik) for every k £ [0,p — 1] such that ik is a 
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matched call position and ik+i = r{ik) + 1; let «o = io < Ji < ■ • • < jt = ip be that 
summary path. Next we show that: 

{w,jk) h {P^ V ret) A (^mcall ^ 0/3^) A (mcall ^ Of,OM < k < £, 

{w, ji) \= [13^ V ret) A (^mcall ^ 0/3e) A (mcall ^ [Ofie V 0^0/3e V 0(-ret A 7))), 
from which we conclude that (tD,i) satisfies (jA.ip . 

We start by showing that the first condition above holds. Let A; € [0,^ — 1]. If jk is 
a return position, then we have that {u},jk) \= {(^■^ V ret). Otherwise, by definition of 
jo < • • • < j£, we have that jk is a position in the semi-strict path iq < ii < ■ ■ ■ < ip-i. 
Thus, from ()A.4p we conclude that {w,jk) \= and, hence, {w,jk) \= {(3^, V ret) 
since ^p and are assumed to be equivalent. It only remains to show that {w,jk) \= 
(-imcall OPij)) A (mcall 0^0/3^)- If jk is a matched call position, then by 
definition of ip we have that and r(jfc) + 1 are both positions in the semi-strict path 
io < ^1 < ■ • • < ip-i- Thus, from ()A.4p we conclude that {w,r{jk) + 1) |= and, 
therefore, {w,jk) \= (mcall OfiOPip)- If jfc is not a matched call position, then we 
have that jk + 1 is a position in the semi-strict path iq < ii < ■ ■ ■ < ip. Thus, from 
()A.4p we conclude that {w,jk + 1) H ^ and, therefore, {w,jk) \= (^mcall — > Op.^,). 

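We now show that the second condition above also holds. Given that $j_\ell = i_p$ is a
matched call position, we have to prove that $(w, j_\ell) \models \beta_\psi \wedge (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta \vee \bigcirc(\neg\mathbf{ret} \wedge
\gamma))$. Given that $(w, i_p) \models \psi$ and we assume that $\psi$ and $\beta_\psi$ are equivalent, we have
that $(w, j_\ell) \models \beta_\psi$. If $q = p + 1$, then given that $(w, i_q) \models \theta$ and we assume that $\theta$
and $\beta_\theta$ are equivalent, we conclude that $(w, j_\ell) \models \bigcirc\beta_\theta$. Thus, assume that $q > p + 1$.
Next we show that $(w, j_\ell) \models \bigcirc(\neg\mathbf{ret} \wedge \gamma)$ in this case. Given that $i_{p+1} = i_p + 1$ and
$i_{p+1}$ is not a return position, we have $(w, i_p + 1) \models \neg\mathbf{ret}$, and it only remains to prove
that $(w, i_p + 1) \models \gamma$. Given that $q > p + 1$, only one summary path with endpoints
$i_p + 1$ and $i_{q-1}$ can be obtained from the sequence $i_p + 1 = i_{p+1} < i_{p+2} < \cdots < i_{q-1}$
by adding positions $r(i_k)$ for every $k \in [p+1, q-2]$ such that $i_k$ is a call position and
$i_{k+1} = r(i_k) + 1$; let $i_p + 1 = s_0 < s_1 < \cdots < s_m = i_{q-1}$ be that summary path. Next
we show that:

$$(w, s_k) \models (\beta_\psi \vee \mathbf{ret}) \wedge (\neg\mathbf{mcall} \to \bigcirc\beta_\psi) \wedge (\mathbf{mcall} \to (\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\psi)) \wedge (\bigcirc\mathbf{ret} \to \mathbf{call}), \qquad 0 \le k < m,$$

$$(w, s_m) \models (\beta_\psi \vee \mathbf{ret}) \wedge (\neg\mathbf{mcall} \to \bigcirc\beta_\theta) \wedge (\mathbf{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta)),$$

from which we conclude that $(w, i_p + 1) \models \gamma$.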

We start by showing that the first condition above holds. Let $k \in [0, m-1]$. If $s_k$ is
a return position, then we have that $(w, s_k) \models (\beta_\psi \vee \mathbf{ret})$. Otherwise, by definition of
$s_0 < \cdots < s_m$, we have that $s_k$ is a position in the semi-strict path $i_{p+1} < i_{p+2} < \cdots <
i_{q-1}$. Thus, from (A.4) we conclude that $(w, s_k) \models \psi$ and, hence, $(w, s_k) \models (\beta_\psi \vee \mathbf{ret})$
since $\psi$ and $\beta_\psi$ are assumed to be equivalent. It only remains to show that:

$$(w, s_k) \models (\neg\mathbf{mcall} \to \bigcirc\beta_\psi) \wedge (\mathbf{mcall} \to (\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\psi)) \wedge (\bigcirc\mathbf{ret} \to \mathbf{call}).$$

If $s_k$ is a matched call position, then $s_k$ is a position in the semi-strict path $i_{p+1} < i_{p+2} <
\cdots < i_{q-1}$ and either (a) $s_{k+1} = s_k + 1$ and $s_k + 1$ is a position in the semi-strict path
$i_{p+1} < i_{p+2} < \cdots < i_{q-1}$, or (b) $s_{k+1} = r(s_k)$ and $r(s_k) + 1$ is a position in the
semi-strict path $i_{p+1} < i_{p+2} < \cdots < i_{q-1}$. In the former case, from (A.4) we conclude that
$(w, s_k + 1) \models \psi$ and, therefore, $(w, s_k) \models \mathbf{mcall} \to \bigcirc\beta_\psi$. In the latter case, from (A.4)
we conclude that $(w, r(s_k) + 1) \models \psi$ and, therefore, $(w, s_k) \models \mathbf{mcall} \to \bigcirc_\mu\bigcirc\beta_\psi$. Thus,
if $s_k$ is a matched call position, then $(w, s_k) \models \mathbf{mcall} \to (\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\psi)$. Assume
now that $s_k$ is not a matched call position. Given that $i_0 < \cdots < i_p < i_{p+1} < \cdots < i_q$
is a semi-strict path, $i_p$ is a matched call position, $i_{p+1} = i_p + 1$ and $i_p + 1$ is not
a return position, we have that $i_q < r(i_p)$. Thus, given that $s_k < i_q$, we have that
$i_p < s_k < r(i_p)$, which implies that $s_k$ is either an internal position or a return position.
Therefore, $s_k + 1$ is a position in the semi-strict path $i_{p+1} < i_{p+2} < \cdots < i_{q-1}$ and, thus,
from (A.4) we conclude that $(w, s_k + 1) \models \psi$. Hence, $(w, s_k) \models \neg\mathbf{mcall} \to \bigcirc\beta_\psi$, and
it only remains to prove that $(w, s_k) \models \bigcirc\mathbf{ret} \to \mathbf{call}$. On the contrary, assume that
$(w, s_k) \models \bigcirc\mathbf{ret}$ and $(w, s_k) \not\models \mathbf{call}$. Given that $s_k$ is not a call position and $s_k < i_{q-1}$,
we have that $s_k + 1$ is a position in the semi-strict path $i_{p+1} < \cdots < i_{q-1}$. Thus, given
that $(w, s_k) \models \bigcirc\mathbf{ret}$, we conclude that there exists a return position in the sequence
$i_p + 1 < \cdots < i_{q-1}$. But this leads to a contradiction since from the fact that $i_q < r(i_p)$,
we can conclude that none of the elements $i_{p+1}, \ldots, i_{q-1}$ is a return position.

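To conclude this part of the proof, we need to show that the second condition above
holds, that is, $(w, s_m) \models (\beta_\psi \vee \mathbf{ret}) \wedge (\neg\mathbf{mcall} \to \bigcirc\beta_\theta) \wedge (\mathbf{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta))$.
Given that $s_m = i_{q-1}$, we have that $(w, s_m) \models \psi$ and, therefore, $(w, s_m) \models \beta_\psi \vee \mathbf{ret}$. It
remains to show that $(w, s_m) \models (\neg\mathbf{mcall} \to \bigcirc\beta_\theta) \wedge (\mathbf{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta))$. Given
that $i_q < r(i_p)$, we know that $s_m = i_{q-1}$ is not a return position. If $s_m$ is an internal
position, then $i_q = s_m + 1$ and, thus, $(w, s_m) \models \neg\mathbf{mcall} \to \bigcirc\beta_\theta$ since $(w, i_q) \models \theta$. If
$s_m$ is a call position, then $s_m$ has a matching return and either $i_q = s_m + 1$ or $i_q =
r(i_{q-1}) + 1$. In the former case, we have that $(w, s_m) \models \mathbf{mcall} \to \bigcirc\beta_\theta$ since $(w, i_q) \models \theta$.
In the latter case, we have that $(w, s_m) \models \mathbf{mcall} \to \bigcirc_\mu\bigcirc\beta_\theta$ since $(w, i_q) \models \theta$. Hence,
we conclude that $(w, s_m) \models \mathbf{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta)$.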
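(II) Assume that condition (I) does not hold, that is, assume that there is no $k \in [0, q-1]$
such that $i_k$ is a matched call position, $i_{k+1} = i_k + 1$ and $i_{k+1}$ is not a return
position. Then only one summary path with endpoints $i = i_0$ and $i_q$ can be obtained
from the semi-strict path $i_0 < i_1 < \cdots < i_q$ by adding positions $r(i_k)$ for
every $k \in [0, q-1]$ such that $i_k$ is a matched call position and $i_{k+1} = r(i_k) + 1$;
let $i_0 = j_0 < j_1 < \cdots < j_\ell = i_q$ be that summary path. Next we show that:

$$(w, j_k) \models (\beta_\psi \vee \mathbf{ret}) \wedge (\neg\mathbf{mcall} \to \bigcirc\beta_\psi) \wedge (\mathbf{mcall} \to (\bigcirc_\mu\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\theta)), \qquad 0 \le k < \ell - 1,$$

$$(w, j_{\ell-1}) \models (\beta_\psi \vee \mathbf{ret}) \wedge (\neg\mathbf{mcall} \to \bigcirc\beta_\theta) \wedge (\mathbf{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta \vee \bigcirc(\neg\mathbf{ret} \wedge \gamma))),$$

from which we conclude that $(w, i)$ satisfies (A.1).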
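
The distinction between cases (I) and (II) amounts to searching the semi-strict path for the first index $p$ at which the path steps from a matched call position into the body of the call. A minimal Python sketch of that search follows. It is illustrative only: the names `first_case_one_index`, `matching_return` (a dictionary from matched call positions to their returns) and `return_positions` (the set of all return positions, matched or not) are hypothetical and do not appear in the paper.

```python
def first_case_one_index(path, matching_return, return_positions):
    """Return the first k in [0, q-1] such that i_k is a matched call position,
    i_{k+1} = i_k + 1 and i_{k+1} is not a return position; None if no such k.

    path             -- list [i_0, ..., i_q], assumed to be a semi-strict path
    matching_return  -- hypothetical dict: matched call position -> return position
    return_positions -- hypothetical set of all return positions of the nested word
    """
    for k in range(len(path) - 1):
        i_k, i_next = path[k], path[k + 1]
        if (i_k in matching_return
                and i_next == i_k + 1
                and i_next not in return_positions):
            return k  # case (I) applies with p = k
    return None  # case (II) applies


# Call at 2 with return at 5; the path enters the call body at position 3.
p = first_case_one_index([1, 2, 3, 4], {2: 5}, {5})
```

A result of `None` corresponds to case (II), where the whole path up to $i_q$ is completed to a summary path.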

We start by showing that the first condition above holds. Let $k \in [0, \ell-2]$. If $j_k$ is
a return position, then we have that $(w, j_k) \models (\beta_\psi \vee \mathbf{ret})$. Otherwise, by definition of
$j_0 < \cdots < j_\ell$, we have that $j_k$ is a position in the semi-strict path $i_0 < i_1 < \cdots < i_{q-1}$.
Thus, from (A.4) we conclude that $(w, j_k) \models \psi$ and, hence, $(w, j_k) \models (\beta_\psi \vee \mathbf{ret})$
since $\psi$ and $\beta_\psi$ are assumed to be equivalent. It only remains to show that $(w, j_k) \models
(\neg\mathbf{mcall} \to \bigcirc\beta_\psi) \wedge (\mathbf{mcall} \to (\bigcirc_\mu\bigcirc\beta_\psi \vee \bigcirc_\mu\bigcirc\beta_\theta))$. If $j_k$ is a matched call position
and $k < \ell - 2$, then given that $i_0 < i_1 < \cdots < i_q$ is a semi-strict path and condition
(I) does not hold, we have that $j_k$ and $r(j_k) + 1$ are both positions in the semi-strict
path $i_0 < i_1 < \cdots < i_{q-1}$. Thus, from (A.4) we conclude that $(w, r(j_k) + 1) \models \psi$ and,
therefore, $(w, j_k) \models \mathbf{mcall} \to \bigcirc_\mu\bigcirc\beta_\psi$. If $j_k$ is a matched call position and $k = \ell - 2$,
then given that $i_0 < i_1 < \cdots < i_q$ is a semi-strict path and condition (I) does not hold,
we have that $j_{\ell-1} = r(j_k)$ and $i_q = j_\ell = r(j_k) + 1$. Thus, given that $(w, i_q) \models \theta$, we
conclude that $(w, j_k) \models \mathbf{mcall} \to \bigcirc_\mu\bigcirc\beta_\theta$. Finally, if $j_k$ is not a matched call position,
then we have that $j_k + 1$ is a position in the semi-strict path $i_0 < i_1 < \cdots < i_{q-1}$
(since $k < \ell - 1$). Thus, from (A.4) we conclude that $(w, j_k + 1) \models \psi$ and, therefore,
$(w, j_k) \models \neg\mathbf{mcall} \to \bigcirc\beta_\psi$.






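To conclude the proof of the lemma, we show that the second condition above also
holds, that is, $(w, j_{\ell-1}) \models (\beta_\psi \vee \mathbf{ret}) \wedge (\neg\mathbf{mcall} \to \bigcirc\beta_\theta) \wedge (\mathbf{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta \vee
\bigcirc(\neg\mathbf{ret} \wedge \gamma)))$. If $j_{\ell-1}$ is a return position, we immediately conclude that $(w, j_{\ell-1}) \models
(\beta_\psi \vee \mathbf{ret})$. Thus, assume that $j_{\ell-1}$ is not a return position. But in this case we
conclude that $j_{\ell-1}$ is a position in the semi-strict path $i_0 < i_1 < \cdots < i_{q-1}$ and, thus,
$(w, j_{\ell-1}) \models (\beta_\psi \vee \mathbf{ret})$ since $(w, j_{\ell-1}) \models \psi$ and we assume that $\psi$ and $\beta_\psi$ are equivalent.
It only remains to show that:

$$(w, j_{\ell-1}) \models (\neg\mathbf{mcall} \to \bigcirc\beta_\theta) \wedge (\mathbf{mcall} \to (\bigcirc\beta_\theta \vee \bigcirc_\mu\bigcirc\beta_\theta \vee \bigcirc(\neg\mathbf{ret} \wedge \gamma))).$$

If $j_{\ell-1}$ is a matched call position, then given that condition (I) does not hold, we have
that $i_q = j_\ell = r(j_{\ell-1}) = j_{\ell-1} + 1$. Thus, given that $(w, i_q) \models \theta$ and we assume that
$\theta$ and $\beta_\theta$ are equivalent, we conclude that $(w, j_{\ell-1}) \models \bigcirc\beta_\theta$ and, therefore, $(w, j_{\ell-1}) \models
\mathbf{mcall} \to \bigcirc\beta_\theta$. If $j_{\ell-1}$ is not a matched call position, then we have that $i_q = j_\ell =
j_{\ell-1} + 1$. Thus, given that $(w, i_q) \models \theta$, we have that $(w, j_{\ell-1}) \models \bigcirc\beta_\theta$ and, therefore,
$(w, j_{\ell-1}) \models \neg\mathbf{mcall} \to \bigcirc\beta_\theta$. This concludes the proof of the lemma. $\Box$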